API Reference

class crabwalk.Walk(*paths, max_depth=None, follow_symlinks=False, max_filesize=None, global_ignore_files=None, custom_ignore_filenames=None, overrides=None, types=None, hidden=True, parents=True, ignore=True, git_global=True, git_ignore=True, git_exclude=True, require_git=True, ignore_case_insensitive=False, sort=None, same_file_system=False, skip_stdout=False, filter_entry=None, onerror=None)

Recursive directory iterator which yields DirEntry objects.

If Walk is not closed (either by using a with statement or calling close() explicitly) then a ResourceWarning will be emitted in its destructor.
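
As a minimal sketch of the recommended pattern, the helper below opens a Walk in a with statement so that the iterator is closed even if iteration raises. The name list_files is illustrative and not part of crabwalk. The import sits inside the function only so that the sketch can be loaded where crabwalk is not installed.

```python
def list_files(root: str) -> list[str]:
    """Return the paths of regular files under root, closing Walk on exit."""
    # Imported lazily so this sketch can be loaded without crabwalk installed.
    from crabwalk import Walk

    # Leaving the with block closes the iterator, so no ResourceWarning
    # is emitted from the destructor.
    with Walk(root) as walk:
        return [entry.path for entry in walk if entry.is_file()]
```

Calling list_files(".") would then return the non-ignored files below the current directory, subject to the default filters described in the parameter list.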

Parameters:
  • paths (Union[str, os.PathLike[str]]) – Paths to iterate recursively.

  • max_depth (Optional[int]) – The maximum depth to recurse.

  • follow_symlinks (bool) – Whether to follow symbolic links or not.

  • max_filesize (Optional[int]) – Ignore files larger than the specified size limit.

  • global_ignore_files (Sequence[Union[str, os.PathLike[str]]]) – Paths to global ignore files. These have lower precedence than all other sources of ignore rules.

  • custom_ignore_filenames (Sequence[str]) – Custom ignore file names. These have higher precedence than all other ignore files.

  • overrides (Optional[Overrides]) – Add an override matcher.

  • types (Optional[Types]) – Add a file type matcher.

  • hidden (bool) – Enables ignoring hidden files.

  • parents (bool) – Enables reading ignore files from parent directories. When enabled, .gitignore files in parent directories of each file path given are respected. Otherwise, they are ignored.

  • ignore (bool) –

    Enables reading .ignore files.

    .ignore files have the same semantics as gitignore files and are supported by search tools such as ripgrep and The Silver Searcher.

  • git_global (bool) –

    Enables reading a global gitignore file, whose path is specified in git’s core.excludesFile config option.

    Git’s config file location is $HOME/.gitconfig. If $HOME/.gitconfig does not exist or does not specify core.excludesFile, then $XDG_CONFIG_HOME/git/ignore is read. If $XDG_CONFIG_HOME is not set or is empty, then $HOME/.config/git/ignore is used instead.

  • git_ignore (bool) – Enables reading .gitignore files.

  • git_exclude (bool) –

    Enables reading .git/info/exclude files.

    .git/info/exclude files have the match semantics described in the gitignore man page.

  • require_git (bool) – Whether a git repository is required to apply git-related ignore rules (global rules, .gitignore and local exclude rules).

  • ignore_case_insensitive (bool) – Process ignore files case insensitively.

  • sort (Union[Callable[[str], SupportsRichComparison], bool]) – May be True to sort entries by file path, or a callable that extracts a comparison key from the file path (like the key argument to sorted()).

  • same_file_system (bool) – Do not cross file system boundaries.

  • skip_stdout (bool) –

    Do not yield directory entries that are believed to correspond to stdout.

    This is useful when a command is invoked via shell redirection to a file that is also being read. For example, grep -r foo ./ > results might end up trying to search results even though it is also writing to it, which could cause an unbounded feedback loop. Setting this option prevents this from happening by skipping over the results file.

  • filter_entry (Optional[Callable[[DirEntry], bool]]) – Yields only entries that satisfy the given predicate and skips descending into directories that do not satisfy it.

  • onerror (Optional[Callable[[Exception], None]]) – By default, errors are ignored. You may specify a function to either log the error or re-raise it.
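
The callback parameters above (filter_entry, sort, and onerror) can be combined as in the following sketch. All function names are hypothetical, and the directory names being skipped are arbitrary examples. The crabwalk import is deferred into walk_sources so that the callbacks can be exercised on their own.

```python
from types import SimpleNamespace


def skip_build_dirs(entry) -> bool:
    """filter_entry predicate: reject entries named 'build' or 'dist'."""
    return entry.name not in {"build", "dist"}


def by_lowercase_path(path: str) -> str:
    """sort key: order entries case-insensitively by file path."""
    return path.lower()


def report_error(exc: Exception) -> None:
    """onerror callback: log the error instead of silently ignoring it."""
    print(f"walk error: {exc!r}")


def walk_sources(root: str) -> list[str]:
    """Walk root at most three levels deep using the callbacks above."""
    # Imported lazily so the callbacks can be used without crabwalk installed.
    from crabwalk import Walk

    with Walk(
        root,
        max_depth=3,
        filter_entry=skip_build_dirs,
        sort=by_lowercase_path,
        onerror=report_error,
    ) as walk:
        return [entry.path for entry in walk]


# The callbacks are plain functions, so they can be checked without a walk.
# SimpleNamespace stands in for a DirEntry here.
print(skip_build_dirs(SimpleNamespace(name="build")))  # False
print(by_lowercase_path("Docs/README.md"))  # docs/readme.md
```

Because filter_entry also prevents descent into rejected directories, nothing below a build or dist directory would be visited in this sketch.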

disable_standard_filters() None

Disable the hidden, parents, ignore, git_ignore, git_global, and git_exclude filters.

enable_standard_filters() None

Enable the hidden, parents, ignore, git_ignore, git_global, and git_exclude filters.
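
One possible use of these methods is to list every entry, including hidden and ignored files. The sketch below assumes the filters should be switched off before iteration starts. The helper name list_all_entries is hypothetical.

```python
def list_all_entries(root: str) -> list[str]:
    """Return every path under root with the standard filters disabled."""
    # Imported lazily so this sketch can be loaded without crabwalk installed.
    from crabwalk import Walk

    walk = Walk(root)
    # Assumption: filters are configured before the first entry is requested.
    walk.disable_standard_filters()
    with walk:
        return [entry.path for entry in walk]
```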

close() None

Close the iterator and free acquired resources.

It is recommended to use a with statement instead.

class crabwalk.DirEntry

Object yielded by Walk to expose the file path and other file attributes of a directory entry.

The interface is similar—but not identical—to os.DirEntry.

DirEntry implements the os.PathLike interface.

name: str

Return the base filename of this entry.

os.DirEntry difference

If this entry has no file name (e.g., /), then the full path is returned.

path: str

The entry’s full path name. The path is only absolute if the Walk path argument was absolute.

inode() int

Return the inode number of the entry.

If follow_symlinks=True and this entry is a symbolic link, the inode number of the target is returned.

os.DirEntry difference

In contrast, os.DirEntry always returns the inode number of the symbolic link itself.

Use os.stat(entry, follow_symlinks=False).st_ino if that’s what you want.

On the first, uncached call, a system call is required on Windows but not on Unix.
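
The difference between the two inode numbers can be seen with the standard library alone. The sketch below uses plain paths rather than DirEntry objects (both are accepted by os.stat, since DirEntry implements os.PathLike). The function names are illustrative.

```python
import os
import tempfile


def link_and_target_inodes(path: str) -> tuple[int, int]:
    """Return (inode of the link itself, inode of what it points to)."""
    link_ino = os.stat(path, follow_symlinks=False).st_ino
    target_ino = os.stat(path).st_ino  # os.stat follows symlinks by default
    return link_ino, target_ino


def demo() -> tuple[int, int]:
    """Create a file and a symlink to it, then compare their inode numbers."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")
        link = os.path.join(tmp, "link.txt")
        with open(target, "w"):
            pass
        os.symlink(target, link)  # may need extra privileges on Windows
        return link_and_target_inodes(link)
```

On a typical Unix file system the two numbers returned by demo() differ, because the symbolic link is a separate file with its own inode.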

is_dir() bool

Returns whether this entry is a directory. If Walk was configured with follow_symlinks=True, a symbolic link pointing to a directory also counts as a directory.

Never makes any system calls.

os.DirEntry difference

There is no follow_symlinks argument, as it is configured via Walk.

is_file() bool

Returns whether this entry is a file. If Walk was configured with follow_symlinks=True, a symbolic link pointing to a file also counts as a file.

Never makes any system calls.

os.DirEntry difference

There is no follow_symlinks argument, as it is configured via Walk.

is_symlink() bool

Returns whether this entry is a symbolic link.

stat() os.stat_result

Returns a stat_result object for this entry. Follows symbolic links if Walk was configured with follow_symlinks=True.

Always makes a system call.

os.DirEntry difference

There is no follow_symlinks argument, as it is configured via Walk.

depth: int

The depth at which this entry was created relative to the root.

follow_symlinks: bool

Whether this entry is configured to follow symlinks or not (inherited from the Walk instance which yielded it).

class crabwalk.Types(initial=None, /, **kwargs)

A collection of type definitions with selections and negations.

Types implements the MutableMapping interface.

Typical usage:

types = Types()
types.add_defaults()
types.select("...")
types.negate("...")

with Walk(..., types=types) as walk:
    ...

Note

A Types instance without any selections or negations won’t affect the output of Walk.
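
A concrete version of the typical usage might look like the sketch below, which defines a "py" type by hand and selects it. The helper name find_python_files is hypothetical. Entries are narrowed with is_file() because the sketch is only interested in files.

```python
def find_python_files(root: str) -> list[str]:
    """Return files under root matching a hand-defined 'py' file type."""
    # Imported lazily so this sketch can be loaded without crabwalk installed.
    from crabwalk import Types, Walk

    types = Types()
    types.add("py", "*.py")
    types.add("py", "*.pyi")
    types.select("py")  # without a selection, Types would not affect the walk

    with Walk(root, types=types) as walk:
        return [entry.path for entry in walk if entry.is_file()]
```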

add(name: str, glob: str) None

Add glob to the file type named name.

>>> types = Types()
>>> types.add("py", "*.py")
>>> types.add("py", "*.pyi")
>>> types["py"]
('*.py', '*.pyi')

add_defaults() None

Add a set of default file type definitions.

>>> types = Types()
>>> types.add_defaults()
>>> types["py"]
('*.py',)

select(name: str) None

Select the file type given by name.

If name is "all", then all file types currently defined are selected.

negate(name: str) None

Ignore the file type given by name.

If name is "all", then all file types currently defined are ignored.

class crabwalk.Override(glob, case_insensitive=False)

A namedtuple used by Overrides.

Parameters:
  • glob (str) – A glob with the same semantics as a single line in a .gitignore file, where the meaning of ! is inverted: namely, ! at the beginning of a glob will ignore a file. Without !, all matches of the glob provided are treated as whitelist matches.

  • case_insensitive (bool) – Whether this glob should be matched case insensitively or not.

class crabwalk.Overrides(overrides=(), *, path)

A MutableSequence of Override tuples.

Strings and tuples will be coerced to Override instances.

Parameters:
  • overrides (Iterable[str | tuple[str, bool]]) –

    An iterable of globs, (glob, case_insensitive) tuples, or Override namedtuples.

    >>> o = Overrides(["*.py", ("*.pyi", True), Override("!*.pyc")], path=".")
    >>> assert o[0] == Override("*.py", False)
    >>> assert o[1] == Override("*.pyi", True)
    >>> assert o[2] == Override("!*.pyc", False)
    

  • path (Union[str, os.PathLike[str]]) – Globs are matched relative to this path.
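
To show how an Overrides instance might be passed to Walk, the sketch below whitelists Python files and then excludes test modules with an inverted ! glob. The helper name and the glob patterns are illustrative only.

```python
def find_non_test_python(root: str) -> list[str]:
    """Return *.py files under root, excluding those named test_*.py."""
    # Imported lazily so this sketch can be loaded without crabwalk installed.
    from crabwalk import Overrides, Walk

    # "*.py" is a whitelist match; the leading "!" makes the second glob
    # an ignore rule, the inverse of its meaning in a .gitignore file.
    overrides = Overrides(["*.py", "!test_*.py"], path=root)

    with Walk(root, overrides=overrides) as walk:
        return [entry.path for entry in walk if entry.is_file()]
```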

Exceptions

exception crabwalk.WalkError

Base class for all exceptions raised by the crabwalk package.

line

A line number if this error is associated with one.

path

A file path if this error is associated with one.

depth

A directory depth if this error is associated with one while recursively walking a directory.

exception crabwalk.LoopError

An error that occurs when a file loop is detected when traversing symbolic links.

exception crabwalk.GlobError

An error that occurs when trying to parse a glob.

exception crabwalk.PartialError

A collection of “soft” errors. These occur when adding an ignore file partially succeeded.

exception crabwalk.InvalidDefinitionError

A user specified file type definition could not be parsed.

exception crabwalk.UnrecognizedFileTypeError

A file type that is not defined was selected.
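
The line, path, and depth attributes of WalkError can be used to build a readable message inside an onerror callback. The following is a sketch only: both function names are hypothetical, and it assumes that an attribute with no associated value is None.

```python
def describe_walk_error(exc: Exception) -> str:
    """Format an error with whichever of path, line, and depth are present."""
    parts = [f"{type(exc).__name__}: {exc}"]
    for attr in ("path", "line", "depth"):
        # Assumption: attributes without an associated value are None or absent.
        value = getattr(exc, attr, None)
        if value is not None:
            parts.append(f"{attr}={value}")
    return ", ".join(parts)


def walk_collecting_errors(root: str) -> tuple[list[str], list[str]]:
    """Walk root, returning the paths seen and descriptions of any errors."""
    # Imported lazily so describe_walk_error works without crabwalk installed.
    from crabwalk import Walk

    errors: list[str] = []
    with Walk(
        root,
        follow_symlinks=True,
        onerror=lambda exc: errors.append(describe_walk_error(exc)),
    ) as walk:
        paths = [entry.path for entry in walk]
    return paths, errors
```

Collecting messages this way keeps the walk running after an error. A callback that re-raises the exception would instead stop at the first problem.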