pols

List directory contents as Polars DataFrames.

Installation

The polars-ls package can be installed with either polars or polars-lts-cpu using the extras by those names:

pip install polars-ls[polars]
pip install polars-ls[polars-lts-cpu]

If Polars is already installed, you can simply pip install polars-ls.

User guidance

Names are relative

Unlike the usual pathlib.Path notion of a name, the names in ls (and hence pols) are relative names, so . is a valid name. By contrast, Path(".").name comes back as "".

>>> from pathlib import Path
>>> cwd = Path.cwd()
>>> cwd / "."
PosixPath('/home/louis/dev/pols')
>>> cwd / ".."
PosixPath('/home/louis/dev/pols/..')
>>> (cwd / ".").name
'pols'
>>> (cwd / "..").name
'..'
>>> Path(".").name
''

Individual files and directories don't mix

ls collects individual files in one 'set' of results and each directory in its own, and never the two shall meet. If you ls a few files and one or more directories, you'll get one set of results with all the files and one set for each of the directories. This follows from the previous point: the names shown are relative to a directory 'root'. For files specified individually, the current working directory is the assumed 'root'; absolute paths always show as absolute, so their 'root' is shown too.

(This holds even if the individual files are in different folders. Directory listings stay separate because merging entries with different roots, when only their relative names are shown, would be invalid.)

$ ls README.md src src/pols/__init__.py 
README.md  src/pols/__init__.py

src:
pols

To the same effect, pols groups its results in a list of dicts, where each key is the source (either the empty string for the individual files, or the directory root). This allows an identical printout style to ls:

$ ls -A ../.py*
../.python-version

../.pytest_cache:
CACHEDIR.TAG  .gitignore  README.md  v
$ pols -A ../.py*
shape: (1, 1)
┌────────────────────┐
│ name               │
│ ---                │
│ str                │
╞════════════════════╡
│ ../.python-version │
└────────────────────┘
../.pytest_cache:
shape: (4, 1)
┌──────────────┐
│ name         │
│ ---          │
│ str          │
╞══════════════╡
│ README.md    │
│ v            │
│ .gitignore   │
│ CACHEDIR.TAG │
└──────────────┘
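
The grouping described above can be sketched with the standard library alone. This is an illustrative approximation, not the pols implementation: the group_like_ls helper and its return shape (lists of names rather than DataFrames) are assumptions made for the example, and it does not handle paths that fail to exist.

```python
import tempfile
from pathlib import Path


def group_like_ls(args: list[str], cwd: Path) -> list[dict[str, list[str]]]:
    """Hypothetical helper: one group for loose files, one per directory."""
    files: list[str] = []
    dir_groups: list[dict[str, list[str]]] = []
    for arg in args:
        target = cwd / arg
        if target.is_dir():
            # A directory gets its own group, keyed by the directory root.
            dir_groups.append({arg: sorted(p.name for p in target.iterdir())})
        else:
            # Individual files share one group and keep the name as typed.
            files.append(arg)
    # The empty-string key marks the group of individually named files.
    groups = [{"": files}] if files else []
    return groups + dir_groups


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "pols").mkdir(parents=True)
    (root / "README.md").touch()
    (root / "src" / "pols" / "__init__.py").touch()
    result = group_like_ls(["README.md", "src", "src/pols/__init__.py"], root)

print(result)
```

The two files end up together under the "" key even though they live in different folders, while src is listed separately under its own key.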

Globs (Kleene stars) go 1 level deep

You can use ** in ls and pols, but in both cases it only matches one level deep, unlike in other tools (and Python's glob).

$ ls src/pols/**.py
src/pols/cli.py  src/pols/__init__.py  src/pols/pols.py
$ ls src/pols/*/*.py
src/pols/features/a.py  src/pols/features/A.py  src/pols/features/hide.py
src/pols/features/__init__.py  src/pols/features/p.py

Patterns that don't match will error non-fatally

A pattern is allowed to match nothing, just like in ls:

$ ls *.yaml *.toml *.md
ls: cannot access '*.yaml': No such file or directory
 pyproject.toml   README.md

$ pols *.yaml *.toml *.md
pols: cannot access '*.yaml': No such file or directory
shape: (2, 1)
┌────────────────┐
│ name           │
│ ---            │
│ str            │
╞════════════════╡
│ pyproject.toml │
│ README.md      │
└────────────────┘

OSErrors like FileNotFoundError are non-fatal but can be thrown with raise_on_access

If you want such errors to be fatal, pass raise_on_access (--raise-on-access on the command line):

$ pols *.yaml *.toml *.md --raise-on-access
pols: cannot access '*.yaml': No such file or directory
Traceback (most recent call last):
...
FileNotFoundError: No such file or directory

Note that the file expansion and preparation is done before any printing or DataFrame operations, so these errors won't occur mid-way through any Polars computations.

Sorting is applied in the same order given

(Note: so far this only applies for the command line)

Just like ls, pols is affected by the command line order of its sorting flags. The sorts are applied in the order their flags are given, each with Polars .sort(maintain_order=True), so rows that tie in a later sort keep the order left by the earlier ones.

See Polars docs for more information.

$ ls -St
pols.py  features  walk.py  resegment.py  cli.py  __init__.py
$ pols --S --t
shape: (6, 1)
┌──────────────┐
│ name         │
│ ---          │
│ str          │
╞══════════════╡
│ pols.py      │
│ features     │
│ walk.py      │
│ resegment.py │
│ __init__.py  │
│ cli.py       │
└──────────────┘
$ ls -tS
pols.py  walk.py  features  resegment.py  cli.py  __init__.py
$ pols --t --S
shape: (6, 1)
┌──────────────┐
│ name         │
│ ---          │
│ str          │
╞══════════════╡
│ pols.py      │
│ walk.py      │
│ features     │
│ resegment.py │
│ cli.py       │
│ __init__.py  │
└──────────────┘

Differences from ls

The design is intended to stay as close as possible to GNU coreutils ls. The known differences are listed below.

1. hide is not disabled by a/A

hide is not disabled by a/A because there is no need for it to be, and this enables filtering hidden files minus some pattern. In ls, --hide silently fails if passed with -a.
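
As a rough sketch of what "hidden files minus some pattern" means, the filter below applies a hide pattern even when hidden files are shown. The visible_names helper and its parameter names are assumptions for illustration, and fnmatch stands in for whatever matching pols actually uses.

```python
from fnmatch import fnmatch
from typing import Optional


def visible_names(names: list[str], show_hidden: bool = False,
                  hide: Optional[str] = None) -> list[str]:
    """Hypothetical filter: the hide pattern applies even if show_hidden is set."""
    kept = []
    for name in names:
        if not show_hidden and name.startswith("."):
            continue  # default ls behaviour: skip dotfiles
        if hide is not None and fnmatch(name, hide):
            continue  # hide still applies, unlike ls --hide combined with -a
        kept.append(name)
    return kept


names = [".git", ".gitignore", ".venv", "README.md"]
shown = visible_names(names, show_hidden=True, hide=".git*")
print(shown)
```

Here the dotfiles are listed, except for those matching .git*.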

2. Name sort order is standard lexicographic sort

Name sort order also differs: I am just using the regular Polars sort, whereas ls appears to use one in which _ is ignored. Compare:

$ ls -l
total 84
-rw-rw-r-- 1 louis louis    81 Feb  1 19:13 cli.py
drwxrwxr-x 2 louis louis  4096 Feb  3 16:05 features
-rw-rw-r-- 1 louis louis    23 Feb  1 19:13 __init__.py
-rw-rw-r-- 1 louis louis 20138 Feb  3 15:54 pols.py
-rw-rw-r-- 1 louis louis   651 Feb  3 03:19 resegment.py
-rw-rw-r-- 1 louis louis  6543 Feb  3 12:43 walk.py
$ pols --l
shape: (6, 6)
┌──────────────┬─────────────┬───────┬───────┬───────┬─────────────────────────┐
│ name         ┆ permissions ┆ owner ┆ group ┆ size  ┆ time                    │
│ ---          ┆ ---         ┆ ---   ┆ ---   ┆ ---   ┆ ---                     │
│ str          ┆ str         ┆ str   ┆ str   ┆ i64   ┆ datetime[ms]            │
╞══════════════╪═════════════╪═══════╪═══════╪═══════╪═════════════════════════╡
│ __init__.py  ┆ -rw-rw-r--  ┆ louis ┆ louis ┆ 23    ┆ 2025-02-01 19:13:57.460 │
│ cli.py       ┆ -rw-rw-r--  ┆ louis ┆ louis ┆ 81    ┆ 2025-02-01 19:13:57.460 │
│ features     ┆ drwxrwxr-x  ┆ louis ┆ louis ┆ 4096  ┆ 2025-02-03 16:05:02.225 │
│ pols.py      ┆ -rw-rw-r--  ┆ louis ┆ louis ┆ 20138 ┆ 2025-02-03 15:54:40.182 │
│ resegment.py ┆ -rw-rw-r--  ┆ louis ┆ louis ┆ 651   ┆ 2025-02-03 03:19:02.034 │
│ walk.py      ┆ -rw-rw-r--  ┆ louis ┆ louis ┆ 6543  ┆ 2025-02-03 12:43:07.437 │
└──────────────┴─────────────┴───────┴───────┴───────┴─────────────────────────┘

I personally prefer the second one and think it is more in line with what most users expect anyway, so I'm leaving it that way.
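
The difference can be reproduced in plain Python. The first sort compares code points, which puts _ ahead of lowercase letters, as in the pols output above. The second is only a crude approximation of the ls ordering (dropping underscores and ignoring case before comparing); it is an assumption that mimics the observed result, not a reimplementation of how ls collates names.

```python
names = ["cli.py", "features", "__init__.py", "pols.py", "resegment.py", "walk.py"]

# Plain lexicographic sort: "_" (U+005F) sorts before "a" (U+0061).
plain = sorted(names)

# Crude stand-in for the ls ordering: ignore underscores and case.
ls_like = sorted(names, key=lambda name: name.replace("_", "").lower())

print(plain)
print(ls_like)
```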

Extra features

as_path

The as_path parameter (--as-path flag) returns the results as pathlib.Path objects in a Polars object dtype column named 'path', instead of as strings in a Polars string dtype column named 'name'. On the command line the printed values look the same; only the column name and dtype differ.

$ pols
shape: (1, 1)
┌──────┐
│ name │
│ ---  │
│ str  │
╞══════╡
│ pols │
└──────┘
$ pols --as-path
shape: (1, 1)
┌────────┐
│ path   │
│ ---    │
│ object │
╞════════╡
│ pols   │
└────────┘

drop_override and keep_override

As well as the ls -l style interface, the drop_override parameter (--drop-override in the CLI) lets you specify which columns to drop in place of the defaults, for more control and for ease of debugging.

The drop_override and keep_override flags control which computed columns are dropped from the output. Typically, we wouldn't discard columns once we've computed them, but the underlying goal of this tool is to imitate ls, so we must. To see all the information pols collects, set drop_override to "" (i.e. the empty list as a comma-separated string).

$ pols
.:
shape: (6, 1)
┌────────────────┐
│ name           │
│ ---            │
│ str            │
╞════════════════╡
│ dist           │
│ pyproject.toml │
│ README.md      │
│ src            │
│ tests          │
│ uv.lock        │
└────────────────┘
$ pols --drop-override ''
.:
shape: (6, 5)
┌────────────────┬────────────────┬────────┬────────┬────────────┐
│ path           ┆ name           ┆ rel_to ┆ is_dir ┆ is_symlink │
│ ---            ┆ ---            ┆ ---    ┆ ---    ┆ ---        │
│ object         ┆ str            ┆ object ┆ bool   ┆ bool       │
╞════════════════╪════════════════╪════════╪════════╪════════════╡
│ dist           ┆ dist           ┆ .      ┆ true   ┆ false      │
│ pyproject.toml ┆ pyproject.toml ┆ .      ┆ false  ┆ false      │
│ README.md      ┆ README.md      ┆ .      ┆ false  ┆ false      │
│ src            ┆ src            ┆ .      ┆ true   ┆ false      │
│ tests          ┆ tests          ┆ .      ┆ true   ┆ false      │
│ uv.lock        ┆ uv.lock        ┆ .      ┆ false  ┆ false      │
└────────────────┴────────────────┴────────┴────────┴────────────┘
$ pols --t --drop-override ''
.:
shape: (6, 6)
┌────────────────┬────────────────┬────────┬────────┬────────────┬─────────────────────────┐
│ path           ┆ name           ┆ rel_to ┆ is_dir ┆ is_symlink ┆ time                    │
│ ---            ┆ ---            ┆ ---    ┆ ---    ┆ ---        ┆ ---                     │
│ object         ┆ str            ┆ object ┆ bool   ┆ bool       ┆ datetime[ms]            │
╞════════════════╪════════════════╪════════╪════════╪════════════╪═════════════════════════╡
│ README.md      ┆ README.md      ┆ .      ┆ false  ┆ false      ┆ 2025-02-03 14:30:19.458 │
│ dist           ┆ dist           ┆ .      ┆ true   ┆ false      ┆ 2025-02-03 14:13:09.173 │
│ pyproject.toml ┆ pyproject.toml ┆ .      ┆ false  ┆ false      ┆ 2025-02-03 14:12:54.917 │
│ uv.lock        ┆ uv.lock        ┆ .      ┆ false  ┆ false      ┆ 2025-02-02 12:33:52.007 │
│ src            ┆ src            ┆ .      ┆ true   ┆ false      ┆ 2025-02-01 19:13:57.460 │
│ tests          ┆ tests          ┆ .      ┆ true   ┆ false      ┆ 2025-02-01 19:13:57.460 │
└────────────────┴────────────────┴────────┴────────┴────────────┴─────────────────────────┘

Naturally there is also a keep_override parameter (--keep-override flag), which prevents the named columns from being dropped.

$ pols --t
.:
shape: (6, 1)
┌────────────────┐
│ name           │
│ ---            │
│ str            │
╞════════════════╡
│ README.md      │
│ dist           │
│ pyproject.toml │
│ uv.lock        │
│ src            │
│ tests          │
└────────────────┘
$ pols --t --keep-override path
.:
shape: (6, 2)
┌────────────────┬────────────────┐
│ path           ┆ name           │
│ ---            ┆ ---            │
│ object         ┆ str            │
╞════════════════╪════════════════╡
│ README.md      ┆ README.md      │
│ dist           ┆ dist           │
│ pyproject.toml ┆ pyproject.toml │
│ uv.lock        ┆ uv.lock        │
│ src            ┆ src            │
│ tests          ┆ tests          │
└────────────────┴────────────────┘
$ pols --t --keep-override time
.:
shape: (6, 2)
┌────────────────┬─────────────────────────┐
│ name           ┆ time                    │
│ ---            ┆ ---                     │
│ str            ┆ datetime[ms]            │
╞════════════════╪═════════════════════════╡
│ README.md      ┆ 2025-02-03 14:38:01.979 │
│ dist           ┆ 2025-02-03 14:13:09.173 │
│ pyproject.toml ┆ 2025-02-03 14:12:54.917 │
│ uv.lock        ┆ 2025-02-02 12:33:52.007 │
│ src            ┆ 2025-02-01 19:13:57.460 │
│ tests          ┆ 2025-02-01 19:13:57.460 │
└────────────────┴─────────────────────────┘
$ pols --t --keep-override 'path,time'
.:
shape: (6, 3)
┌────────────────┬────────────────┬─────────────────────────┐
│ path           ┆ name           ┆ time                    │
│ ---            ┆ ---            ┆ ---                     │
│ object         ┆ str            ┆ datetime[ms]            │
╞════════════════╪════════════════╪═════════════════════════╡
│ README.md      ┆ README.md      ┆ 2025-02-03 14:38:01.979 │
│ dist           ┆ dist           ┆ 2025-02-03 14:13:09.173 │
│ pyproject.toml ┆ pyproject.toml ┆ 2025-02-03 14:12:54.917 │
│ uv.lock        ┆ uv.lock        ┆ 2025-02-02 12:33:52.007 │
│ src            ┆ src            ┆ 2025-02-01 19:13:57.460 │
│ tests          ┆ tests          ┆ 2025-02-01 19:13:57.460 │
└────────────────┴────────────────┴─────────────────────────┘
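
The column selection implied by the outputs above can be sketched in a few lines of plain Python. Everything here is inferred from those printouts rather than taken from the pols source: the columns_to_show and parse_columns helpers, the default drop list, and the rule that keep_override wins over the drop list are all assumptions.

```python
from typing import Optional


def parse_columns(value: str) -> list[str]:
    """Split a comma-separated string; "" becomes the empty list."""
    return [col for col in value.split(",") if col]


def columns_to_show(all_columns: list[str], default_drop: list[str],
                    drop_override: Optional[str] = None,
                    keep_override: Optional[str] = None) -> list[str]:
    """Hypothetical selection rule: drop the listed columns unless kept."""
    # drop_override, when given, replaces the default drop list entirely.
    drop = default_drop if drop_override is None else parse_columns(drop_override)
    # keep_override rescues individual columns from being dropped.
    keep = parse_columns(keep_override) if keep_override else []
    return [col for col in all_columns if col not in drop or col in keep]


all_columns = ["path", "name", "rel_to", "is_dir", "is_symlink", "time"]
default_drop = ["path", "rel_to", "is_dir", "is_symlink", "time"]  # assumed

default_view = columns_to_show(all_columns, default_drop)
everything = columns_to_show(all_columns, default_drop, drop_override="")
kept = columns_to_show(all_columns, default_drop, keep_override="path,time")

print(default_view, everything, kept)
```

Under these assumptions the three calls line up with the columns shown for pols --t, pols --t --drop-override '' and pols --t --keep-override 'path,time' respectively.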
