List and count words from a column

I have a column with strings containing a list of species:

+----------------------------------------+
|                species                 |
+----------------------------------------+
| Dinosauria, Ornitischia, indeterminado |
| Sirenia                                |
| Dinosauria, Therophoda                 |
| Dinosauria, Therophoda, Allosaurus     |
| and so on...                           |
+----------------------------------------+

I am looking for a way, in PostgreSQL 12, to list and count all the unique names such as:

+---------------+-------+
|    species    | count |
+---------------+-------+
| Dinosauria    | 3     |
| Ornitischia   | 1     |
| indeterminado | 1     |
| Sirenia       | 1     |
| Therophoda    | 2     |
| Allosaurus    | 1     |
+---------------+-------+

How to solve:

Method 1

You can split the comma-separated list into rows using regexp_split_to_table() and then group by that value:

-- one output row per list element; the regex also consumes whitespace around each comma
select s.species, count(*)
from the_table t
  cross join regexp_split_to_table(t.species, '\s*,\s*') as s(species)
group by s.species;

I am using a regex as the delimiter to get rid of the whitespace after the comma. The above would also be possible with unnest(string_to_array(t.species, ',')) but then you need to trim() the values to get rid of the whitespace.
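The unnest(string_to_array()) alternative mentioned above could look roughly like the following. This is only a sketch: the CTE with literal values is a hypothetical stand-in for the_table so the query can be run on its own, and the order by clause is an addition for readability, not part of the original answer.

```sql
-- Hypothetical sample data standing in for the_table from the question
with the_table (species) as (
  values
    ('Dinosauria, Ornitischia, indeterminado'),
    ('Sirenia'),
    ('Dinosauria, Therophoda'),
    ('Dinosauria, Therophoda, Allosaurus')
)
-- string_to_array() splits on the bare comma, so each element may keep
-- a leading space; trim() removes it before grouping
select trim(s.species) as species, count(*) as count
from the_table t
  cross join unnest(string_to_array(t.species, ',')) as s(species)
group by trim(s.species)
order by count desc, species;
```

Because trim() is applied both in the select list and in group by, values such as ' Therophoda' and 'Therophoda' fall into the same group.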

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
