simplestatistics is a Python package that provides easy-to-use statistical methods.

The project is the Python implementation of simple-statistics, which describes itself as “Statistical methods in readable JavaScript for browsers, servers, and people”.

My simplestatistics has a similar goal: statistical methods in readable Python, for whatever use you have in mind.

It’s compatible with Python 2 & 3, and version 0.1.2 is available from the Python Package Index [1]:

pip install simplestatistics

The documentation is available online thanks to Read the Docs.

Some example usage:

>>> import simplestatistics as ss
>>> ss.mean(range(10))  # [0, 1, ..., 9]
4.5
>>> ss.median([1, 2, 5, 7, 9])
5.0
>>> ss.z_scores([-2, -1, 0, 1, 2])
[1.2649110640673518, 0.6324555320336759, 0.0, -0.6324555320336759, -1.2649110640673518]
>>> ss.t_test([1, 2, 3, 4, 5, 6], 5)
-1.9639610121239313 
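
For the curious, here is a minimal sketch of where that last number comes from. It assumes ss.t_test computes a one-sample t statistic using the sample standard deviation (dividing by n - 1), which is consistent with the value in the session above. The function name one_sample_t is made up for this illustration and is not part of the package.

```python
import math


def one_sample_t(sample, mu):
    """One-sample t statistic: (sample mean - mu) / standard error.

    Hypothetical helper for illustration; not the package's own code.
    """
    n = len(sample)
    mean = sum(sample) / float(n)
    # Sample variance: squared deviations divided by n - 1.
    variance = sum((x - mean) ** 2 for x in sample) / float(n - 1)
    # Standard error of the mean.
    standard_error = math.sqrt(variance) / math.sqrt(n)
    return (mean - mu) / standard_error


t = one_sample_t([1, 2, 3, 4, 5, 6], 5)
print(t)  # roughly -1.964
```

The sample mean is 3.5, the sample variance is 3.5, and the standard error is sqrt(3.5) / sqrt(6), so the statistic is -1.5 divided by about 0.764.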

I want this to be a useful package that people use, but I’m primarily working on it for selfish reasons. I’ve learned a lot of Python programming, testing, and documentation in the process, and as I add more complicated functions (Bayesian classification, regression models, distributions), I will learn more math and statistics than I did in university.

The rules help a lot with the learning:

  • Everything should be implemented in raw, organic, locally sourced Python.
  • Use libraries only if you have to, and only when they are unrelated to the math/statistics. For example, from functools import reduce makes reduce available for those using Python 3. That’s okay, because it’s about making Python work and not about making the stats easier.
  • It’s okay to use operators and functions if they correspond to regular calculator buttons. For example, all calculators have a built-in square root function, so there is no need to implement that ourselves; we can use math.sqrt(). Anything beyond that, like mean or median, we have to write ourselves.
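
To make the rules concrete, here is a rough sketch of what functions written under them might look like. These are illustrations only, not the package's actual source: the function bodies are my own guesses at a rule-abiding style, standard_deviation here is the population version, and I am assuming the built-in sorted() counts as "making Python work" rather than as a statistics shortcut.

```python
import math
from functools import reduce  # allowed: makes reduce available on Python 3


def mean(values):
    """Arithmetic mean: sum of the values divided by how many there are."""
    values = list(values)
    total = reduce(lambda acc, x: acc + x, values, 0)
    return total / float(len(values))


def median(values):
    """Middle value of the sorted data (average of the two middle values
    when the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return float(ordered[middle])
    return (ordered[middle - 1] + ordered[middle]) / 2.0


def standard_deviation(values):
    """Population standard deviation, built on our own mean().

    math.sqrt is fine: square root is a regular calculator button.
    """
    values = list(values)
    m = mean(values)
    squared_deviations = [(x - m) ** 2 for x in values]
    return math.sqrt(mean(squared_deviations))


print(mean(range(10)))          # 4.5
print(median([1, 2, 5, 7, 9]))  # 5.0
```

Nothing here reaches for a statistics library: each function is built from arithmetic, the square root, and functions written earlier in the same file.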

I think one of the best ways to make sure you learn something right is to know enough to make it from (relative) scratch. You have to understand and be able to explain a lot to make that happen. This is the main reason I’m enjoying working on simplestatistics.


  1. 0.1 originally. The additional 0.0.2 was me working out some kinks with registering and releasing a package on PyPI.