A collection of python functions for somebody's sanity
foc: fun oriented code, or francis' odd collection.
Functions from the Python standard library are great. But some notations are a bit painful and confusing for personal use, so I created this odd collection of functions.
Tl;dr
- foc provides a collection of higher-order functions and some (pure) helpful functions.
- foc respects the Python standard library. It never reinvents the wheel.
How to use
# install
$ pip install -U foc
# import
>>> from foc import *
# Take a look at the examples below
To list all available functions, call flist().
Ground rules
- Followed Haskell-like function names and argument order
- Considered using generators first if possible (lazy evaluation): map, filter, zip, range, flat, ...
- Provide functions that unpack generators into a list as well (it is annoying to unpack with [*] or list every time)
- Function names that end in l indicate the result will be unpacked in a list: mapl, filterl, zipl, rangel, flatl, takewhilel, dropwhilel, ...
- Function names that end in _ indicate that the function is a partial application (not-fully-evaluated function) builder: f_, ff_, c_, cc_, m_, v_, u_, ...
- Most function implementations should be less than 5 lines.
- No dependencies except for the Python standard library
- No unnecessary wrapping objects.
Examples
Note: foc's functions are valid for any iterable such as list, tuple, deque, set, str, ...
>>> id("francis")
'francis'
>>> fst(["sofia", "maria", "claire"])
'sofia'
>>> snd(("sofia", "maria", "claire"))
'maria'
>>> nth(3, ["sofia", "maria", "claire"]) # not list index, but literally n-th
'claire'
>>> take(3, range(5, 10))
[5, 6, 7]
>>> list(drop(3, "github")) # `drop` returns a generator
['h', 'u', 'b']
>>> head(range(1,5)) # range(1, 5) = [1, 2, 3, 4]
1
>>> last(range(1,5))
4
>>> list(init(range(1,5))) # `init` returns a generator
[1, 2, 3]
>>> list(tail(range(1,5))) # `tail` returns a generator
[2, 3, 4]
>>> pred(3)
2
>>> succ(3)
4
>>> odd(3)
True
>>> even(3)
False
>>> null([]) == null(()) == null({}) == null("")
True
>>> elem(5, range(10))
True
>>> words("fun on functions")
['fun', 'on', 'functions']
>>> unwords(['fun', 'on', 'functions'])
'fun on functions'
>>> lines("fun\non\nfunctions")
['fun', 'on', 'functions']
>>> unlines(['fun', 'on', 'functions'])
"fun\non\nfunctions"
>>> take(3, repeat(5)) # repeat(5) = [5, 5, ...]
[5, 5, 5]
>>> take(5, cycle("fun")) # cycle("fun") = ['f', 'u', 'n', 'f', 'u', 'n', ...]
['f', 'u', 'n', 'f', 'u']
>>> replicate(3, 5) # the same as 'take(3, repeat(5))'
[5, 5, 5]
>>> take(3, count(2)) # count(2) = [2, 3, 4, 5, ...]
[2, 3, 4]
>>> take(3, count(2, 3)) # count(2, 3) = [2, 5, 8, 11, ...]
[2, 5, 8]
Get binary functions from python operators: sym
sym(OP) converts Python's symbolic operators into binary functions.
The string forms of operators like +, -, /, *, **, ==, !=, .. represent the corresponding binary functions.
To list all available symbols, call sym().
>>> sym("+")(5, 2) # 5 + 2
7
>>> sym("==")("sofia", "maria") # "sofia" == "maria"
False
>>> sym("%")(123456, 83) # 123456 % 83
35
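One way to picture a lookup like sym is a table from operator strings to the functions in the standard operator module. The sketch below only illustrates that idea and is not foc's actual implementation; the names OPERATORS and sym_sketch are hypothetical, and the table covers just a few symbols.

```python
import operator

# Hypothetical table mapping operator symbols to binary functions.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "**": operator.pow,
    "%": operator.mod,
    "==": operator.eq,
    "!=": operator.ne,
}


def sym_sketch(op=None):
    """Return the binary function for `op`, or the known symbols if omitted."""
    if op is None:
        return sorted(OPERATORS)
    return OPERATORS[op]


print(sym_sketch("+")(5, 2))        # 7
print(sym_sketch("%")(123456, 83))  # 35
```

Keeping the symbols in a plain dict makes the "list all symbols" case a one-liner, at the cost of raising KeyError for anything outside the table.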
Build partial application: f_ and ff_
f_ builds a left-associative partial application, where the given function's arguments are partially evaluated from the left.
ff_ builds a right-associative partial application, where the given function's arguments are partially evaluated from the right.
f_(fn, *args, **kwargs)
ff_(fn, *args, **kwargs) == f_(flip(fn), *args, **kwargs)
>>> f_("+", 5)(2) # the same as `(5+) 2` in Haskell
7 # 5 + 2
>>> ff_("+", 5)(2) # the same as `(+5) 2` in Haskell
7 # 2 + 5
>>> f_("-", 5)(2) # the same as `(5-) 2`
3 # 5 - 2
>>> ff_("-", 5)(2) # the same as `(subtract 5) 2`
-3 # 2 - 5
# with N-ary function
>>> def print_args(a, b, c, d): print(f"{a}-{b}-{c}-{d}")
>>> f_(print_args, 1, 2)(3, 4) # partial-eval from the left
1-2-3-4 # print_args(1, 2, 3, 4)
>>> f_(print_args, 1, 2, 3)(4) # partial-eval with a different number of args
1-2-3-4 # print_args(1, 2, 3, 4)
>>> ff_(print_args, 1, 2)(3, 4) # partial-eval from the right
4-3-2-1 # print_args(4, 3, 2, 1)
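The relation ff_(fn, ..) == f_(flip(fn), ..) can be reproduced with functools.partial. The following is a minimal sketch under that assumption, not foc's own code: flip_sketch, f_sketch and ff_sketch are hypothetical names, and unlike f_ they accept only callables (no operator strings such as "+").

```python
from functools import partial


def flip_sketch(fn):
    """Return a function that calls `fn` with its positional arguments reversed."""
    return lambda *args, **kwargs: fn(*reversed(args), **kwargs)


def f_sketch(fn, *args, **kwargs):
    """Fix the leading arguments of `fn` (partial application from the left)."""
    return partial(fn, *args, **kwargs)


def ff_sketch(fn, *args, **kwargs):
    """Fix arguments of the flipped `fn` (partial application from the right)."""
    return partial(flip_sketch(fn), *args, **kwargs)


def sub(a, b):
    return a - b


def join4(a, b, c, d):
    return f"{a}-{b}-{c}-{d}"


print(f_sketch(sub, 5)(2))           # 3, i.e. 5 - 2
print(ff_sketch(sub, 5)(2))          # -3, i.e. 2 - 5
print(ff_sketch(join4, 1, 2)(3, 4))  # 4-3-2-1
```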
Build curried functions: c_ and cc_
When currying a given function, c_ takes the function's arguments from the left, while cc_ takes them from the right.
c_(fn) == curry(fn)
cc_(fn) == c_(flip(fn))
See also uncurry
# currying from the left args
>>> c_("+")(5)(2) # 5 + 2
7
>>> c_("-")(5)(2) # 5 - 2
3
# currying from the right args
>>> cc_("+")(5)(2) # 2 + 5
7
>>> cc_("-")(5)(2) # 2 - 5
-3
# with N-ary function
>>> c_(print_args)(1)(2)(3)(4) # print_args(1, 2, 3, 4)
1-2-3-4
>>> cc_(print_args)(1)(2)(3)(4) # print_args(4, 3, 2, 1)
4-3-2-1
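Currying can be sketched by collecting one argument per call until the function's arity is reached. This is an illustrative sketch, not foc's implementation: curry_sketch and its arity parameter are hypothetical, and the arity is read from inspect.signature unless given explicitly.

```python
from inspect import signature


def curry_sketch(fn, arity=None):
    """Turn an N-ary function into a chain of N unary functions."""
    n = len(signature(fn).parameters) if arity is None else arity

    def collect(args):
        # Call `fn` once enough arguments have been gathered.
        if len(args) == n:
            return fn(*args)
        return lambda x: collect(args + (x,))

    return collect(())


def join4(a, b, c, d):
    return f"{a}-{b}-{c}-{d}"


# Currying from the left.
print(curry_sketch(join4)(1)(2)(3)(4))             # 1-2-3-4

# Currying from the right: flip the argument order first.
# The flipped function takes *args, so its arity must be passed explicitly.
flipped = lambda *args: join4(*reversed(args))
print(curry_sketch(flipped, arity=4)(1)(2)(3)(4))  # 4-3-2-1
```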
Build composition of functions: cf_ and cfd
cf_ (composition of function) composes functions using the given list of functions.
cfd (composing-function decorator) decorates a function with the given list of functions.
cf_(*fn, rep=None)
cfd(*fn, rep=None)
>>> square = ff_("**", 2) # the same as (^2) in Haskell
>>> add5 = ff_("+", 5) # the same as (+5) in Haskell
>>> mul7 = ff_("*", 7) # the same as (*7) in Haskell
>>> cf_(mul7, add5, square)(3) # (*7) . (+5) . (^2) $ 3
98 # mul7(add5(square(3))) = ((3 ^ 2) + 5) * 7
>>> cf_(square, rep=3)(2) # cf_(square, square, square)(2) == ((2 ^ 2) ^ 2) ^ 2 = 256
256
>>> @cfd(mul7, add5, square)
... def even_num_less_than(x):
...     return len(list(filter(even, range(x))))
>>> even_num_less_than(7) # 'even numbers less than 7' = len({0, 2, 4, 6}) = 4
147 # mul7(add5(square(4))) = ((4 ^ 2) + 5) * 7 = 147
# the meaning of decorating a function with a composition of functions
g = cfd(a, b, c, d)(f) # g = (a . b . c . d)(f)
# the same
cfd(a, b, c, d)(f)(x) # g(x) = a(b(c(d(f(x)))))
cf_(a, b, c, d, f)(x) # (a . b . c . d . f)(x) = a(b(c(d(f(x))))) = g(x)
cfd is very handy for recreating previously defined functions by composing functions. All you need is to write basic functions that do fundamental things.
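Right-to-left composition and its decorator form can be sketched with functools.reduce. This is an illustration rather than foc's code: compose_sketch and compose_decorator_sketch are hypothetical names, at least one function must be given, and rep is assumed here to repeat the whole chain (the examples above only show it with a single function).

```python
from functools import reduce, wraps


def compose_sketch(*fns, rep=None):
    """Compose right-to-left: compose_sketch(a, b, c)(x) == a(b(c(x)))."""
    if rep is not None:
        fns = fns * rep  # assumption: repeat the whole chain `rep` times

    def composed(*args, **kwargs):
        *rest, first = fns
        # The rightmost function receives the original arguments;
        # the others are applied to the running result, right to left.
        return reduce(lambda acc, f: f(acc), reversed(rest), first(*args, **kwargs))

    return composed


def compose_decorator_sketch(*fns):
    """Decorator form: post-process the decorated function's result with `fns`."""
    def decorator(f):
        return wraps(f)(compose_sketch(*fns, f))
    return decorator


square = lambda x: x ** 2
add5 = lambda x: x + 5
mul7 = lambda x: x * 7

print(compose_sketch(mul7, add5, square)(3))  # 98
print(compose_sketch(square, rep=3)(2))       # 256


@compose_decorator_sketch(mul7, add5, square)
def count_even_below(x):
    return len([n for n in range(x) if n % 2 == 0])


print(count_even_below(7))                    # 147
```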
Partial application of map: m_ and mm_
m_ builds partial application of map (left-associative).
mm_ builds partial application from right to left (right-associative).
Compared to Haskell,
f <$> xs == map(f, xs)
(f <$>) == f_(map, f) == m_(f)
(<$> xs) == f_(flip(map), xs) == mm_(xs)
Unpacking with list(..) or [* .. ] is sometimes very annoying. For tasks that do not consume much memory, use mapl instead.
# mapl(f, xs) == [* map(f, xs)] == list(map(f, xs))
>>> mapl = cfd(list)(map)
# and likewise for 'm_' and 'mm_'
>>> ml_ = cfd(list)(m_)
>>> mml_ = cfd(list)(mm_)
# The same as [ (lambda x: 8*x)(x) for x in range(1, 6) ]
>>> list(map(f_("*", 8), range(1, 6))) # (8*) <$> [1..5]
[8, 16, 24, 32, 40]
# the same: shorter using 'mapl'
>>> mapl(f_("*", 8), range(1, 6)) # (8*) <$> [1..5]
[8, 16, 24, 32, 40]
# the same: partial application (from left)
>>> ml_(f_("*", 8))(range(1, 6)) # ((8*) <$>) [1..5]
[8, 16, 24, 32, 40]
# the same: partial application (from right)
>>> mml_(range(1, 6))(f_("*", 8)) # (<$> [1..5]) (8*)
[8, 16, 24, 32, 40]
Partial application of filter: v_ and vv_
v_ builds partial application of filter (left-associative).
vv_ builds partial application from right to left (right-associative).
They work the same way as m_ and mm_ (mapping functions over iterables), except that they filter iterables using a predicate function.
The name v_ comes from the shape of a funnel.
# filterl(f, xs) == [* filter(f, xs)] == list(filter(f, xs))
>>> filterl = cfd(list)(filter)
>>> vl_ = cfd(list)(v_) # v_ = f_(filter, f)
>>> vvl_ = cfd(list)(vv_) # vv_ = ff_(filter, xs)
# generate a filter to select only even numbers
>>> even_nums = vl_(even)
>>> even_nums(range(10))
[0, 2, 4, 6, 8]
>>> even_nums({2, 3, 5, 7, 11, 13, 17})
[2]
# partially evaluated 'filter' using 'prime numbers less than 20'
>>> primes_lt_20 = vvl_([2, 3, 5, 7, 11, 13, 17, 19])
# filter out numbers LE 10
>>> primes_lt_20(ff_(">", 10)) # (> 10)
[11, 13, 17, 19]
# used a lambda function
>>> primes_lt_20(lambda x: x % 3 == 2)
[2, 5, 11, 17]
# used the composition of functions
>>> primes_lt_20(cf_(ff_("==", 2), ff_("%", 3))) # ((== 2) . (% 3))
[2, 5, 11, 17]
Other higher-order functions
>>> flip(pow)(7, 3) # the same as `pow(3, 7) = 3 ** 7`
2187
>>> bimap(f_("+", 3), f_("*", 7), (5, 7)) # bimap (3+) (7*) (5, 7)
(8, 49) # (3+5, 7*7)
>>> first(f_("+", 3), (5, 7)) # first (3+) (5, 7)
(8, 7) # (3+5, 7)
>>> second(f_("*", 7), (5, 7)) # second (7*) (5, 7)
(5, 49) # (5, 7*7)
>>> take(5, iterate(lambda x: x**2, 2)) # [2, 2**2, (2**2)**2, ((2**2)**2)**2, ...]
[2, 4, 16, 256, 65536]
>>> [* takewhile(even, [2, 4, 6, 1, 3, 5]) ] # `takewhile` returns a generator
[2, 4, 6]
>>> takewhilel(even, [2, 4, 6, 1, 3, 5])
[2, 4, 6]
>>> [* dropwhile(even, [2, 4, 6, 1, 3, 5]) ] # `dropwhile` returns a generator
[1, 3, 5]
>>> dropwhilel(even, [2, 4, 6, 1, 3, 5])
[1, 3, 5]
# fold with a given initial value from the left
>>> foldl("-", 10, range(1, 5)) # foldl (-) 10 [1..4]
0
# fold with a given initial value from the right
>>> foldr("-", 10, range(1, 5)) # foldr (-) 10 [1..4]
8
# `foldl` without an initial value (used first item instead)
>>> foldl1("-", range(1, 5)) # foldl1 (-) [1..4]
-8
# `foldr` without an initial value (used last item instead)
>>> foldr1("-", range(1, 5)) # foldr1 (-) [1..4]
-2
# accumulate reduced values from the left
>>> scanl("-", 10, range(1, 5)) # scanl (-) 10 [1..4]
[10, 9, 7, 4, 0]
# accumulate reduced values from the right
>>> scanr("-", 10, range(1, 5)) # scanr (-) 10 [1..4]
[8, -7, 9, -6, 10]
# `scanl` but no starting value
>>> scanl1("-", range(1, 5)) # scanl1 (-) [1..4]
[1, -1, -4, -8]
# `scanr` but no starting value
>>> scanr1("-", range(1, 5)) # scanr1 (-) [1..4]
[-2, 3, -1, 4]
# See also 'concat' that returns a generator
>>> concatl(["sofia", "maria"])
['s', 'o', 'f', 'i', 'a', 'm', 'a', 'r', 'i', 'a']
# Note that ["sofia", "maria"] = [['s','o','f','i','a'], ['m','a','r','i','a']]
# See also 'concatmap' that returns a generator
>>> concatmapl(str.upper, ["sofia", "maria"]) # concatmapl = cfd(list, concat)(map)
['S', 'O', 'F', 'I', 'A', 'M', 'A', 'R', 'I', 'A']
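The fold and scan results above can be cross-checked with the standard library alone. The sketch below is not foc's implementation; it uses functools.reduce and itertools.accumulate (the initial keyword needs Python 3.8+), and the *_sketch names are hypothetical.

```python
from functools import reduce
from itertools import accumulate
from operator import sub


def foldl_sketch(f, init, xs):
    """((init f x1) f x2) f ... : combine from the left."""
    return reduce(f, xs, init)


def foldr_sketch(f, init, xs):
    """x1 f (x2 f (... f init)) : combine from the right."""
    return reduce(lambda acc, x: f(x, acc), reversed(list(xs)), init)


def scanl_sketch(f, init, xs):
    """Like foldl, but keep every intermediate result."""
    return list(accumulate(xs, f, initial=init))


def scanr_sketch(f, init, xs):
    """Like foldr, but keep every intermediate result."""
    acc = accumulate(reversed(list(xs)), lambda a, x: f(x, a), initial=init)
    return list(acc)[::-1]


print(foldl_sketch(sub, 10, range(1, 5)))  # 0
print(foldr_sketch(sub, 10, range(1, 5)))  # 8
print(scanl_sketch(sub, 10, range(1, 5)))  # [10, 9, 7, 4, 0]
print(scanr_sketch(sub, 10, range(1, 5)))  # [8, -7, 9, -6, 10]
```

The right-hand variants reverse the input and swap the operand order, which is why they must materialize the iterable first.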
Lazy Evaluation: lazy and force
lazy defers the evaluation of a function (or expression) and returns the deferred expression.
force forces the deferred expression to be fully evaluated when needed. It is reminiscent of Haskell's force x = deepseq x x.
lazy(function-name, *args, **kwargs)
force(expr)
mforce([expr])
# strictly generate a random integer between [1, 10)
>>> randint(1, 10)
# generate a lazy expression for the above
>>> deferred = lazy(randint, 1, 10)
# evaluate it when needed
>>> force(deferred)
# the same as above
>>> deferred()
Are those evaluations with lazy really deferred?
>>> long_list = randint(1, 100000, 100000) # a list of 100,000 random integers
>>> %timeit sort(long_list)
142 ms ± 245 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# See the evaluation was deferred
>>> %timeit lazy(sort, long_list)
1.03 µs ± 2.68 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Example
Given a function randint(low, high), how can we generate a list of random integers?
[ randint(1, 10) for _ in range(5) ] # exactly the same as 'randint(1, 10, 5)'
It's the simplest way, but what about using replicate?
# generate a list of random integers using 'replicate'
>>> replicate(5, randint(1, 10))
[7, 7, 7, 7, 7] # ouch, duplication of the first evaluated item.
Wrong! This result is definitely not what we want. We need to defer the function evaluation until it is replicated.
Just use lazy(randint, 1, 10) instead of randint(1, 10).
# replicate 'deferred expression'
>>> randos = replicate(5, lazy(randint, 1, 10))
# evaluate when needed
>>> mforce(randos) # mforce = ml_(force), map 'force' over deferred expressions
[6, 2, 5, 1, 9] # exactly what we wanted
Here is the simple secret: if you complete f_ or ff_ with a function name and its arguments and leave it unevaluated (not called), it acts as a deferred expression.
It is not related to lazy, but you can do the same thing with uncurry.
# replicate the tuple of arguments (1, 10) and then apply to uncurried function
>>> ml_(u_(randint))(replicate(5, (1,10))) # u_ == uncurry
[7, 6, 1, 7, 2]
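The "simple secret" can be made concrete with functools.partial: a deferred expression is just a call that has been packaged but not yet run. The sketch below is an assumption-laden illustration, not foc's code. lazy_sketch and force_sketch are hypothetical names, a counter stands in for randint so the output is reproducible, and force_sketch naively treats every callable as deferred.

```python
from functools import partial
from itertools import count


def lazy_sketch(fn, *args, **kwargs):
    """Package a call without running it."""
    return partial(fn, *args, **kwargs)


def force_sketch(expr):
    """Run a deferred call; pass ordinary values through unchanged."""
    return expr() if callable(expr) else expr


ticket = count(1)                   # deterministic stand-in for randint
next_ticket = lambda: next(ticket)

eager = [next_ticket()] * 3         # evaluated once, then duplicated
deferred = [lazy_sketch(next_ticket)] * 3
forced = [force_sketch(d) for d in deferred]  # evaluated once per item

print(eager)   # [1, 1, 1]
print(forced)  # [2, 3, 4]
```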
Normalize containers: flat
flat flattens all kinds of iterables except for string-like objects (str, bytes).
flat(*args)
# Assume that we regenerate 'data' every time in the examples below
>>> data = [1,2,[3,4,[[[5],6],7,{8},((9),10)],range(11,13)], (x for x in [13,14,15])]
# 'flat' returns a generator. flatl = cfd(list)(flat)
>>> flatl(data) # list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
>>> flatt(data) # tuple
(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15)
>>> flats(data) # set
{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
>>> flatd(data) # deque
deque([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
# regardless of the number of arguments
>>> flatl(1,[2,{3}],[[[[[4]],5]]], "sofia", "maria")
[1, 2, 3, 4, 5, 'sofia', 'maria']
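The behavior described for flat (recurse into every iterable, but treat str and bytes as leaves) can be sketched as a recursive generator. This is a minimal illustration and not foc's implementation; flat_sketch is a hypothetical name.

```python
from collections.abc import Iterable


def flat_sketch(*args):
    """Yield the leaves of arbitrarily nested iterables, left to right."""
    for item in args:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            # Recurse element by element so generators are consumed lazily.
            for sub in item:
                yield from flat_sketch(sub)
        else:
            yield item


data = [1, 2, [3, 4, [[[5], 6], 7, {8}, (9, 10)], range(11, 13)], (x for x in [13, 14, 15])]
flattened = list(flat_sketch(data))
mixed = list(flat_sketch(1, [2, {3}], [[[[[4]], 5]]], "sofia", "maria"))

print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
print(mixed)      # [1, 2, 3, 4, 5, 'sofia', 'maria']
```

Without the str/bytes exclusion the recursion would never terminate, since iterating a one-character string yields that same string again.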
Handy File Tools: ls and grep
Use ls and grep in the same way you use them in your terminal every day.
This is just a more intuitive alternative to os.listdir and os.walk.
When applicable, try using the more flexible shell("ls -a1 <path>") or shell("find <path>") instead.
See also: shell
Background
Path from pathlib and glob are great and useful. But,
- Not intuitive: os.path.expanduser("~") every time?
- Non-automated filepath normalization
- No flexible understanding: no tolerance for foc//__init__.py (/ typo)
- Not integrated: listing (os.listdir), globbing (glob.glob) and selecting files (filter)
Usage
ls(*paths, grep=REGEX, i=BOOL, r=BOOL, f=BOOL, d=BOOL, g=BOOL)
- supports glob patterns (*, ?, [) in *paths
- if grep=REGEX is given, it behaves like ls -a1 *paths | grep REGEX
- if i is set, it makes grep case-insensitive (-i flag in grep)
- if r is set, it behaves like find -s *paths (-R flag in ls)
- if f is set, it lists only files like find -s *paths -type f
- if d is set, it lists only directories like find -s *paths -type d
- if g is set, it returns a generator instead of a sorted list
# couldn't be simpler!
>>> ls() # the same as ls("."): get contents of the current dir
# expands "~" automatically
>>> ls("~") # the same as `ls -a1 ~`: returns a list of $HOME
# support glob patterns (*, ?, [)
>>> ls("./*/*.py")
# with multiple filepaths
>>> ls(FILE, DIR, ...)
# list recursively and filter hidden files out
>>> ls(".git", r=True, grep=r"^[^\.]")
# only files in '.git' directory
>>> ls(".git", r=True, f=True)
# only directories in '.git' directory
>>> ls(".git", r=True, d=True)
# search recursively and match a pattern with `grep`
>>> ls(".", r=True, i=True, grep=".Py") # 'i=True' for case-insensitive grep pattern
[ ..
'.pytest_cache/v/cache/stepwise',
'foc/__init__.py',
'foc/__pycache__/__init__.cpython-310.pyc',
'tests/__init__.py',
.. ]
# regex patterns come in
>>> ls(".", r=True, grep=".py$")
['foc/__init__.py', 'setup.py', 'tests/__init__.py', 'tests/test_foc.py']
# that's it!
>>> ls(".", r=True, grep="^(foc).*py$")
# the same as above
>>> ls("foc/*.py")
['foc/__init__.py']
grep builds a filter that selects items matching a REGEX pattern from iterables.
grep(REGEX, i=BOOL)
# 'grep' builds filter with regex patterns
>>> grep(r"^(foc).*py$")(ls(".", r=True))
['foc/__init__.py']
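A filter builder of this kind can be sketched with the re module: compile the pattern once, then return a function that keeps matching items. This is an illustration, not foc's grep; grep_sketch is a hypothetical name, and returning a list (rather than a generator) is an assumption based on the output shown above.

```python
import re


def grep_sketch(regex, i=False):
    """Build a filter that keeps the items matching `regex`."""
    pattern = re.compile(regex, re.IGNORECASE if i else 0)
    return lambda xs: [x for x in xs if pattern.search(x)]


paths = ["foc/__init__.py", "setup.py", "tests/test_foc.py", "README.md"]

print(grep_sketch(r"^(foc).*py$")(paths))    # ['foc/__init__.py']
print(grep_sketch(r"\.PY$", i=True)(paths))  # every path ending in .py
```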
See also: HOME, cd, pwd, mkdir, rmdir, exists, dirname, and basename.
Neatify data structures: neatly and nprint
neatly generates a neatly formatted string for complex data structures of dict and list.
nprint (neatly-print) prints data structures to stdout using the neatly formatter.
nprint(...) = print(neatly(...))
nprint(DICT, _cols=INDENT, _width=WRAP, **kwargs)
>>> o = {
... "$id": "https://example.com/enumerated-values.schema.json",
... "$schema": "https://json-schema.org/draft/2020-12/schema",
... "title": "Enumerated Values",
... "type": "object",
... "properties": {
... "data": {
... "enum": [42, True, "hello", None, [1, 2, 3]]
... }
... }
... }
>>> nprint(o)
$id | 'https://example.com/enumerated-values.schema.json'
$schema | 'https://json-schema.org/draft/2020-12/schema'
properties | data | enum - 42
: : - True
: : - 'hello'
: : - None
: : - - 1
: : - - 2
: : - - 3
title | 'Enumerated Values'
type | 'object'
Dot-accessible dictionary: dmap
dmap is yet another dict. It's exactly the same as dict, but it allows access to its nested structure with dot notation.
dmap(DICT, **kwargs)
>>> d = dmap() # empty dict
>>> d = dmap(dict(...))
>>> d = dmap(name="yunchan lim", age=19, profession="pianist") # or dmap({"name":.., "age":..,})
# just put the value in the desired keypath
>>> d.cliburn.semifinal.mozart = "piano concerto no.22"
>>> d.cliburn.semifinal.liszt = "12 transcendental etudes"
>>> d.cliburn.final.beethoven = "piano concerto no.3"
>>> d.cliburn.final.rachmaninoff = "piano concerto no.3"
>>> nprint(d)
name | 'yunchan lim'
age | 19
profession | 'pianist'
cliburn | semifinal | mozart | 'piano concerto no.22'
: : liszt | '12 transcendental etudes'
: final | beethoven | 'piano concerto no.3'
: : rachmaninoff | 'piano concerto no.3'
>>> del d.cliburn.semifinal
>>> d.profession = "one-in-a-million talent"
>>> nprint(d)
name | 'yunchan lim'
age | 19
profession | 'one-in-a-million talent'
cliburn | final | beethoven | 'piano concerto no.3'
: : rachmaninoff | 'piano concerto no.3'
# No such keypath
>>> d.bach.chopin.beethoven
{}
raise and assert with expressions: error and guard
Raise any kind of exception in a lambda expression as well.
>>> error(MESSAGE, e=EXCEPTION_TO_RAISE) # by default, e=SystemExit
>>> error("Error, used wrong type", e=TypeError)
>>> error("out of range", e=IndexError)
>>> (lambda x: x if x is not None else error("Error, got None", e=ValueError))(None)
Likewise, use guard when you need an assertion as an expression rather than as a statement.
>>> guard(PREDICATE, MESSAGE, e=EXCEPTION_TO_RAISE) # by default, e=SystemExit
>>> guard("Almost" == "enough", "'Almost' is never 'enough'")
>>> guard(rand() > 0.5, "Assertion error occurs with a 0.5 probability")
>>> guard(len(x := range(11)) == 10, f"length is not 10: {len(x)}")
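Because raise and assert are statements, wrapping them in functions is what makes them usable inside expressions. The following is a minimal sketch of that idea, not foc's implementation; error_sketch and guard_sketch are hypothetical names, and only the e=SystemExit default is taken from the usage above.

```python
def error_sketch(msg="error", e=SystemExit):
    """Raise `e(msg)`; being a function, it can sit inside any expression."""
    raise e(msg)


def guard_sketch(pred, msg="guard failed", e=SystemExit):
    """Raise `e(msg)` unless `pred` is truthy."""
    if not pred:
        error_sketch(msg, e=e)


non_none = lambda x: x if x is not None else error_sketch("got None", e=ValueError)

print(non_none(3))  # 3
try:
    non_none(None)
except ValueError as exc:
    print(f"ValueError: {exc}")  # ValueError: got None
```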
Real-World Example
A causal self-attention of the transformer model based on pytorch can be described as follows.
Somebody insists that this helps to follow the process flow without distraction.
def forward(self, x):
    B, S, E = x.size()  # size_batch, size_block (sequence length), size_embed
    N, H = self.config.num_heads, E // self.config.num_heads  # E == (N * H)
    q, k, v = self.c_attn(x).split(self.config.size_embed, dim=2)
    q = q.view(B, S, N, H).transpose(1, 2)  # (B, N, S, H)
    k = k.view(B, S, N, H).transpose(1, 2)  # (B, N, S, H)
    v = v.view(B, S, N, H).transpose(1, 2)  # (B, N, S, H)
    # Attention(Q, K, V)
    #   = softmax( Q*K^T / sqrt(d_k) ) * V
    #       // q*k^T: (B, N, S, H) x (B, N, H, S) -> (B, N, S, S)
    #   = attention-prob-matrix * V
    #       // prob @ v: (B, N, S, S) x (B, N, S, H) -> (B, N, S, H)
    #   = attention-weighted value (attention score)
    return cf_(
        self.dropout,  # dropout of layer's output
        self.c_proj,  # linear projection
        ff_(torch.Tensor.view, *_r(B, S, E)),  # (B, S, N, H) -> (B, S, E)
        torch.Tensor.contiguous,  # contiguous in-memory tensor
        ff_(torch.transpose, *_r(1, 2)),  # (B, S, N, H)
        ff_(torch.matmul, v),  # (B, N, S, S) x (B, N, S, H) -> (B, N, S, H)
        self.dropout_attn,  # attention dropout
        ff_(torch.masked_fill, *_r(mask == 0, 0.0)),  # double-check masking
        f_(F.softmax, dim=-1),  # softmax
        ff_(torch.masked_fill, *_r(mask == 0, float("-inf"))),  # no-look-ahead
        ff_("/", math.sqrt(k.size(-1))),  # / sqrt(d_k)
        ff_(torch.matmul, k.transpose(-2, -1)),  # Q @ K^T -> (B, N, S, S)
    )(q)