API Reference#

class crabwalk.Walk(*paths, max_depth=None, follow_links=False, max_filesize=None, global_ignore_files=None, custom_ignore_filenames=None, overrides=None, types=None, hidden=True, parents=True, ignore=True, git_global=True, git_ignore=True, git_exclude=True, require_git=True, ignore_case_insensitive=False, sort=None, same_file_system=False, skip_stdout=False, filter_entry=None, onerror=None)#

Recursive directory iterator which yields DirEntry objects.

If Walk is not closed (either by using a with statement or calling close() explicitly) then a ResourceWarning will be emitted in its destructor.
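
A minimal usage sketch of the with-statement form follows. The helper name list_files is hypothetical, and the guarded import exists only so the snippet can be loaded where crabwalk is not installed; it assumes that the with statement binds the Walk instance itself.

```python
try:
    from crabwalk import Walk  # third-party package
except ImportError:
    Walk = None  # guard only: lets this sketch load without crabwalk installed


def list_files(root):
    """Return the paths of regular files under root (hypothetical helper)."""
    # The with statement closes the iterator on exit, so no ResourceWarning
    # is emitted later by the destructor.
    with Walk(root, sort=True) as walk:
        return [entry.path for entry in walk if entry.is_file()]
```

With crabwalk installed, list_files(".") would return the non-ignored files below the current directory, sorted by path.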

Parameters:
  • paths (Union[str, os.PathLike[str]]) – Paths to iterate recursively.

  • max_depth (Optional[int]) – The maximum depth to recurse.

  • follow_links (bool) – Whether to follow symbolic links or not.

  • max_filesize (Optional[int]) – Ignore files larger than the specified size limit. If None, no limit is applied.

  • global_ignore_files (Sequence[Union[str, os.PathLike[str]]]) – Paths to global ignore files. These have lower precedence than all other sources of ignore rules.

  • custom_ignore_filenames (Sequence[str]) – Custom ignore file names. These have higher precedence than all other ignore files.

  • overrides (Optional[Overrides]) – Add an override matcher.

  • types (Optional[Types]) – Add a file type matcher.

  • hidden (bool) – Enables ignoring hidden files.

  • parents (bool) – Enables reading ignore files from parent directories. When enabled, .gitignore files in parent directories of each file path given are respected. Otherwise, they are ignored.

  • ignore (bool) –

    Enables reading .ignore files.

    .ignore files have the same semantics as gitignore files and are supported by search tools such as ripgrep and The Silver Searcher.

  • git_global (bool) –

    Enables reading a global gitignore file, whose path is specified in git’s core.excludesFile config option.

    Git’s config file location is $HOME/.gitconfig. If $HOME/.gitconfig does not exist or does not specify core.excludesFile, then $XDG_CONFIG_HOME/git/ignore is read. If $XDG_CONFIG_HOME is not set or is empty, then $HOME/.config/git/ignore is used instead.

  • git_ignore (bool) – Enables reading .gitignore files.

  • git_exclude (bool) –

    Enables reading .git/info/exclude files.

    .git/info/exclude files have match semantics as described in the gitignore man page.

  • require_git (bool) – Whether a git repository is required to apply git-related ignore rules (global rules, .gitignore and local exclude rules).

  • ignore_case_insensitive (bool) – Process ignore files case insensitively.

  • sort (Union[Callable[[str], SupportsRichComparison], bool]) – May be True to sort entries by file path, or a callable that extracts a comparison key from the file path (like the key argument to sorted()).

  • same_file_system (bool) – Do not cross file system boundaries.

  • skip_stdout (bool) –

    Do not yield directory entries that are believed to correspond to stdout.

    This is useful when a command is invoked via shell redirection to a file that is also being read. For example, grep -r foo ./ > results might end up trying to search results even though it is also writing to it, which could cause an unbounded feedback loop. Setting this option prevents this from happening by skipping over the results file.

  • filter_entry (Optional[Callable[[DirEntry], bool]]) – Yield only entries that satisfy the given predicate, and do not descend into directories that fail it.

  • onerror (Optional[Callable[[Exception], None]]) – By default, errors are ignored. You may specify a function to either log the error or re-raise it.
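
The pruning behaviour described for filter_entry can be illustrated with a standard-library sketch. This is not crabwalk's implementation: walk_filtered is a hypothetical stand-in built on os.scandir, it applies no ignore rules, and it does not yield the root itself.

```python
import os
import tempfile


def walk_filtered(root, predicate):
    """Yield os.DirEntry objects below root that satisfy predicate."""
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        if not predicate(entry):
            continue  # rejected: not yielded and, if a directory, not entered
        yield entry
        if entry.is_dir(follow_symlinks=False):
            yield from walk_filtered(entry.path, predicate)


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "keep"))
    os.makedirs(os.path.join(tmp, "skip"))
    open(os.path.join(tmp, "keep", "a.txt"), "w").close()
    open(os.path.join(tmp, "skip", "b.txt"), "w").close()
    # "skip" fails the predicate, so b.txt is never visited.
    names = [e.name for e in walk_filtered(tmp, lambda e: e.name != "skip")]

print(names)  # ['keep', 'a.txt']
```

The corresponding crabwalk call would pass the same kind of predicate, for example Walk(root, filter_entry=lambda e: e.name != "skip").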

disable_standard_filters() None#

Disable the hidden, parents, ignore, git_ignore, git_global, and git_exclude filters.

enable_standard_filters() None#

Enable the hidden, parents, ignore, git_ignore, git_global, and git_exclude filters.

close() None#

Close the iterator and free acquired resources.

It is recommended to use a with statement instead.

class crabwalk.DirEntry#

Object yielded by Walk to expose the file path and other file attributes of a directory entry.

The interface is similar, but not identical, to os.DirEntry.

DirEntry implements the os.PathLike interface.

name: str#

The base filename of this entry.

If this entry has no file name (e.g., /), then the full path is returned.

path: str#

The entry’s full path name. The path is only absolute if the Walk path argument was absolute.

inode() int#

Return the inode number of the entry.

Caution

If follow_links=True and this entry is a symbolic link, the inode number of the target is returned. This is different from os.DirEntry which always returns the inode number of the symbolic link itself.

Use os.stat(entry.path, follow_symlinks=False).st_ino to get the inode number of the symbolic link itself.

On the first, uncached call, a system call is required on Windows but not on Unix.
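
The difference between the inode of a symbolic link and the inode of its target can be checked with the standard library alone. The sketch below uses only os.stat; the file names are illustrative, and it assumes a platform where unprivileged symlink creation is allowed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link.txt")
    open(target, "w").close()
    os.symlink(target, link)  # may require extra privileges on Windows

    target_inode = os.stat(target).st_ino
    # Following the link resolves to the target's inode.
    followed_inode = os.stat(link).st_ino
    # Not following it reports the inode of the link itself.
    link_inode = os.stat(link, follow_symlinks=False).st_ino

print(followed_inode == target_inode, link_inode == target_inode)
```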

is_dir() bool#

Returns True if this entry is a directory, or if Walk was configured with follow_links=True and this entry is a symbolic link pointing to a directory.

is_file() bool#

Returns True if this entry is a file, or if Walk was configured with follow_links=True and this entry is a symbolic link pointing to a file.

is_symlink() bool#

Returns whether this entry is a symbolic link.

depth: int#

The depth at which this entry was created relative to the root.

class crabwalk.Types(initial=None, /, **kwargs)#

A collection of type definitions with selections and negations.

Types implements the MutableMapping interface.

add(name: str, glob: str) None#

Add glob to the file type named name.

>>> types = Types()
>>> types.add("py", "*.py")
>>> types.add("py", "*.pyi")
>>> types["py"]
('*.py', '*.pyi')

add_defaults() None#

Add a set of default file type definitions.

>>> types = Types()
>>> types.add_defaults()
>>> types["py"]
('*.py',)

select(name: str) None#

Select the file type given by name.

If name is all, then all file types currently defined are selected.

negate(name: str) None#

Ignore the file type given by name.

If name is all, then all file types currently defined are ignored.
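
As a sketch of how a type selection might be combined with Walk, the hypothetical helper below selects the default "py" type and passes the matcher through the types parameter. The guarded import exists only so the snippet loads where crabwalk is not installed.

```python
try:
    from crabwalk import Types, Walk  # third-party package
except ImportError:
    Types = Walk = None  # guard only: lets this sketch load without crabwalk


def python_sources(root):
    """Return paths of files under root matching the "py" type (hypothetical helper)."""
    types = Types()
    types.add_defaults()  # load the default type definitions
    types.select("py")    # select the type named "py"
    with Walk(root, types=types) as walk:
        return [entry.path for entry in walk if entry.is_file()]
```

Calling types.negate("py") instead of select would ignore files of that type rather than select them.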

class crabwalk.Override(glob, case_insensitive=False)#

A namedtuple used by Overrides.

Parameters:
  • glob (str) – A glob with the same semantics as a single line in a .gitignore file, where the meaning of ! is inverted: namely, ! at the beginning of a glob will ignore a file. Without !, all matches of the glob provided are treated as whitelist matches.

  • case_insensitive (bool) – Whether this glob should be matched case insensitively or not.

class crabwalk.Overrides(overrides=(), *, path)#

A MutableSequence of Override tuples.

Strings and tuples will be coerced to Override instances.

Parameters:
  • overrides (Iterable[str | Tuple[str, bool]]) –

    An iterable of globs, (glob, case_insensitive) tuples, or Override namedtuples.

    >>> o = Overrides(["*.py", ("*.pyi", True), Override("!*.pyc")], path=".")
    >>> assert o[0] == Override("*.py", False)
    >>> assert o[1] == Override("*.pyi", True)
    >>> assert o[2] == Override("!*.pyc", False)
    

  • path (Union[str, os.PathLike[str]]) – Globs are matched relative to this path.
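
The inverted meaning of ! can be sketched as follows. The helper name and the globs are illustrative. The sketch assumes that, once a whitelist glob is present, files matching no glob are skipped, and that a later glob takes precedence over an earlier one as in a .gitignore file; neither point is confirmed by this reference.

```python
try:
    from crabwalk import Overrides, Walk  # third-party package
except ImportError:
    Overrides = Walk = None  # guard only: lets this sketch load without crabwalk


def python_sources_without_tests(root):
    """Return *.py files under root, skipping test_*.py (hypothetical helper)."""
    overrides = Overrides(
        [
            "*.py",        # no "!": matches are treated as whitelist matches
            "!test_*.py",  # leading "!": matches are ignored
        ],
        path=root,  # globs are matched relative to this path
    )
    with Walk(root, overrides=overrides) as walk:
        return [entry.path for entry in walk if entry.is_file()]
```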

Exceptions#

exception crabwalk.WalkError#

Base class for all exceptions raised by the crabwalk package.

line#

A line number if this error is associated with one.

path#

A file path if this error is associated with one.

depth#

A directory depth if this error is associated with one while recursively walking a directory.
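
An onerror callback could use these attributes to build a readable log message. The sketch below assumes that an attribute is None when the error has no associated value; describe_error and FakeWalkError are hypothetical names, and the stand-in class exists only so the example runs without crabwalk.

```python
def describe_error(err):
    """Format an exception with whichever of path, line, and depth are set."""
    parts = [str(err)]
    for attr in ("path", "line", "depth"):
        value = getattr(err, attr, None)  # assumption: None means "not associated"
        if value is not None:
            parts.append(f"{attr}={value}")
    return " ".join(parts)


class FakeWalkError(Exception):
    """Stand-in for crabwalk.WalkError, used only to exercise the formatter."""

    path = "src/.gitignore"
    line = 3
    depth = None


message = describe_error(FakeWalkError("invalid glob"))
print(message)  # invalid glob path=src/.gitignore line=3
```

Such a formatter could be passed to Walk as onerror=lambda err: print(describe_error(err)).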

exception crabwalk.LoopError#

An error that occurs when a file loop is detected while traversing symbolic links.

exception crabwalk.GlobError#

An error that occurs when trying to parse a glob.

exception crabwalk.PartialError#

A collection of “soft” errors. These occur when adding an ignore file partially succeeded.

exception crabwalk.InvalidDefinitionError#

A user specified file type definition could not be parsed.

exception crabwalk.UnrecognizedFileTypeError#

A file type was selected that is not defined.