How to Sort by Two or More Criteria in Python

The aim of this page📝 is to explain how to use sorted() with multiple criteria, using the example of sorting JSON schema URIs by name and version.

Pavol Kutaj
2 min read · Oct 27, 2023

CODE

Let’s jump right into the code, as the explanation that follows will make much more sense with it in view. Say I have a list of schema endpoints in the format <vendor_tag>/<schema_name>/jsonschema/<major>-<minor>-<patch>, and I need to sort first by <schema_name> and then by the last three items, which together form a version based on semantic versioning. An example of the input:

all_schemas = [
    'com.acme/foo/jsonschema/1-0-1',
    'com.acme/example/jsonschema/1-0-0',
    'com.acme/bar/jsonschema/1-0-1',
    'com.acme/foo/jsonschema/1-0-0',
    'com.acme/bar/jsonschema/1-0-0',
]

The function doing the sorting looks as follows:

# Returns the schemas sorted by schema name, then by version.
def sort(all_schemas: list) -> list:
    def get_schema_name(single_schema: str) -> str:
        # 'com.acme/foo/jsonschema/1-0-1' -> 'foo'
        return single_schema.split('/')[1]

    def get_semver(single_schema: str) -> tuple:
        # 'com.acme/foo/jsonschema/1-0-1' -> (1, 0, 1)
        return tuple(map(int, single_schema.split('/')[-1].split('-')))

    return sorted(all_schemas, key=lambda single_schema: (get_schema_name(single_schema),
                                                          get_semver(single_schema)))

sorted_schemas = sort(all_schemas)

Once called, that function returns exactly what’s required:

[
    'com.acme/bar/jsonschema/1-0-0',
    'com.acme/bar/jsonschema/1-0-1',
    'com.acme/example/jsonschema/1-0-0',
    'com.acme/foo/jsonschema/1-0-0',
    'com.acme/foo/jsonschema/1-0-1',
]
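Why bother converting the version parts with int? A minimal sketch, using made-up URIs that are not part of the example above, of what happens once a version component reaches two digits. The helper names version_as_text and version_as_ints are hypothetical, chosen only for this illustration:

```python
# Hypothetical URIs, chosen so that one patch number has two digits.
schemas = [
    'com.acme/foo/jsonschema/1-0-10',
    'com.acme/foo/jsonschema/1-0-2',
    'com.acme/foo/jsonschema/1-0-9',
]

def version_as_text(schema: str) -> str:
    # Leaves the version as the raw string, e.g. '1-0-10'.
    return schema.split('/')[-1]

def version_as_ints(schema: str) -> tuple:
    # Turns '1-0-10' into (1, 0, 10) so each part compares numerically.
    return tuple(map(int, schema.split('/')[-1].split('-')))

by_text = sorted(schemas, key=version_as_text)
by_ints = sorted(schemas, key=version_as_ints)

print(by_text)  # string comparison puts '1-0-10' before '1-0-2'
print(by_ints)  # numeric comparison gives 1-0-2, 1-0-9, 1-0-10
```

Compared as text, '1-0-10' sorts before '1-0-2' because the character '1' is smaller than '2'. The tuple of integers avoids that.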

EXPLAINER

  • sorted() takes an iterable (here a <list>) and an optional key parameter.
  • what exactly sorted() is doing is none of our business, sorting is an essential CS subject, I am staying on a primitive “how to use” level without flirting with algorithms here.
  • key is optional and is used for custom sorting
  • if you want to sort a list of numbers or words, you don’t really need it.
  • key takes a single <callable>, usually a function, which itself accepts one argument
  • that single function can however wrap more than one sub-function, which is what we are after if we are to sort by more than one criterion
  • this function can be a lambda or a regular function
  • the lambda can return a tuple of values that serve as multiple ordering criteria!
  • say you want to first sort by a name which is part of a larger structure
  • and then, if there are say just 10 namespaces in 1000 rows, you want to sort by something else
  • …like a version sequence that is specific within the given namespace
  • key function is applied to each element of the input list
  • the resulting values are used to compare the elements and determine their order in the sorted list.
  • to sort by multiple criteria, the key function can return a tuple containing the values of the criteria that you want to sort by.
  • the sorted() function will then compare those tuples item by item: the first value decides the order, and the next value is consulted only when the earlier ones are equal.
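The points above can be sketched in a few lines. This is only an illustration under my own assumptions: name_and_version is a hypothetical helper, not part of the function shown earlier, and the list is a shortened version of the example input.

```python
# Tuples compare position by position: the second item only matters
# when the first items are equal.
assert ('bar', (1, 0, 1)) < ('foo', (1, 0, 0))  # names differ, 'bar' < 'foo' decides
assert ('foo', (1, 0, 0)) < ('foo', (1, 0, 1))  # names tie, the version decides

def name_and_version(schema: str) -> tuple:
    # Hypothetical helper: builds the (name, version) key in one place.
    parts = schema.split('/')
    version = tuple(int(n) for n in parts[-1].split('-'))
    return parts[1], version

schemas = [
    'com.acme/foo/jsonschema/1-0-1',
    'com.acme/bar/jsonschema/1-0-1',
    'com.acme/foo/jsonschema/1-0-0',
]

# A regular function and a lambda are interchangeable as the key.
with_def = sorted(schemas, key=name_and_version)
with_lambda = sorted(
    schemas,
    key=lambda s: (s.split('/')[1], tuple(map(int, s.split('/')[-1].split('-')))),
)

# The keys that sorted() compared, in their final order.
keys = [name_and_version(s) for s in with_def]
print(keys)
```

Both calls produce the same order. Whether to use a named function or a lambda is mostly a matter of readability once the key gets longer than one line.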


Pavol Kutaj

Today I Learnt | Infrastructure Support Engineer at snowplow.io with a passion for cloud infrastructure/terraform/python/docs. More at https://pavol.kutaj.com