return percentage per pair of items

I’m using SQLite and I have the following table:
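[Image of the booked_class_airline table, not reproduced here; the queries below use its columns iata_code_airline, class and airline_class_count.]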

and I have 2 goals:
The first is to return the IATA code of the airline that has the most bookings in first class.
I thought of the following, but it doesn’t quite work; it just returns all the airlines in no specific order:

select iata_code_airline
from booked_class_airline
group by class
having MAX(airline_class_count) 

The second goal is to provide a column that shows the percentage of the total amount of passengers that travel in each class.
I tried:

SELECT class, 
       count(class) as class_count,
       count(class) * 100.0 / (select count(*) from booked_class_airline) as class_percent
FROM booked_class_airline
group by class

But this just counts the rows for each class rather than adding up the bookings in each class. I could use the airline_class_count column, but then I don’t know how to divide per class. Any advice?

How to solve :

Method 1

To resolve your issues, I did the following (all the code below is available on the fiddle here):

(Speaking of fiddles, it would be far better if you provided us with one (DDL and DML) along with your question – it eliminates duplication of effort and gives a single source of truth – help us to help you! Also, please avoid using images for the reasons outlined in this link).

  • Q1. The first is to return the IATA code of the airline that has the most bookings in first class.

This is trickier than it might appear at first glance. A naive approach would be to do the following:

SELECT
  class, iata, MAX(cnt)
FROM flight
WHERE class = 'First'
GROUP BY class, iata
LIMIT 1;

Result:

class   iata    MAX(cnt)
First     BA           2

However, what happens when we do the following

INSERT INTO flight VALUES
('First', 'EI', 2);

and rerun our query?

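Same result! But this is wrong, or at least misleading, because we now have two airlines (BA and EI) with two bookings each in first class. Note also that LIMIT 1 without an ORDER BY simply returns whichever group happens to come first, so the naive query is not guaranteed to pick a top airline at all. This demonstrates the perils of testing with a small (or, in this case, minimal) amount of data!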

Here, what we should do is this:

SELECT
  class, iata, MAX(cnt) AS c
FROM flight
WHERE class = 'First'
GROUP BY class, iata
HAVING MAX(cnt) = (
             SELECT 
               MAX(cnt) 
             FROM flight 
             WHERE class = 'First' 
           );

Result:

class   iata    c
First     BA    2
First     EI    2

Depending on your requirements, this may be (and IMHO, probably is) better.
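If you want to compare the two queries without the fiddle, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The flight(class, iata, cnt) layout follows the queries above, and the three First-class rows are the ones visible in the results; the ORDER BY iata is my addition so that the output order is predictable.

```python
import sqlite3

# In-memory database with the table layout used in the answer's queries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flight (class TEXT, iata TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO flight VALUES (?, ?, ?)",
    [("First", "BA", 2), ("First", "SN", 1), ("First", "EI", 2)],
)

# Naive version: LIMIT 1 hides the fact that two airlines are tied.
naive = conn.execute("""
    SELECT class, iata, MAX(cnt)
    FROM flight
    WHERE class = 'First'
    GROUP BY class, iata
    LIMIT 1
""").fetchall()

# Tie-aware version: keep every airline whose count equals the overall maximum.
tie_aware = conn.execute("""
    SELECT class, iata, MAX(cnt) AS c
    FROM flight
    WHERE class = 'First'
    GROUP BY class, iata
    HAVING MAX(cnt) = (SELECT MAX(cnt) FROM flight WHERE class = 'First')
    ORDER BY iata
""").fetchall()

print(len(naive))  # 1 row, no matter how many airlines share the top spot
print(tie_aware)   # [('First', 'BA', 2), ('First', 'EI', 2)]
```

The naive query can only ever report one airline, while the HAVING version reports every airline that shares the maximum.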

Finally, a third approach you might like to consider is using a window function (very technical link; this intro is excellent). It is maybe a bit of overkill here, but window functions are incredibly powerful for more sophisticated queries and will repay any effort spent learning them 10 times over! The one I’m using in this instance is DENSE_RANK() (from the intro link).

SELECT 
  *,
  DENSE_RANK() OVER (PARTITION BY class ORDER BY cnt DESC) AS dr
FROM flight
WHERE class = 'First'
ORDER BY dr ASC;

Result:

class   iata    cnt     dr
First     BA      2     1
First     EI      2     1
First     SN      1     2

You can then wrap this in a sub-query (see fiddle) to pull out the first-ranking airline(s). Why use this? Suppose your manager comes to you and says, "I need the 3rd-ranking airlines that have booked first class flights." That is very tricky with "traditional" SQL, but this makes it a doddle!
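The fiddle's exact sub-query is not reproduced here, so the following is only a sketch of what such a wrapper could look like, again using Python's sqlite3 module. The helper name airlines_at_rank and the sub-query alias ranked are hypothetical, and window functions need SQLite 3.25 or later, which I am assuming your Python build ships with.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flight (class TEXT, iata TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO flight VALUES (?, ?, ?)",
    [("First", "BA", 2), ("First", "EI", 2), ("First", "SN", 1)],
)

# The DENSE_RANK() query from above, wrapped in a sub-query so that the
# outer WHERE clause can filter on the computed rank.
RANKED_QUERY = """
    SELECT iata, cnt
    FROM (
        SELECT iata, cnt,
               DENSE_RANK() OVER (PARTITION BY class ORDER BY cnt DESC) AS dr
        FROM flight
        WHERE class = 'First'
    ) AS ranked
    WHERE dr = ?
    ORDER BY iata
"""

def airlines_at_rank(rank):
    """Return (iata, cnt) for every First-class airline at the given dense rank."""
    return conn.execute(RANKED_QUERY, (rank,)).fetchall()

top = airlines_at_rank(1)
second = airlines_at_rank(2)
print(top)     # [('BA', 2), ('EI', 2)]
print(second)  # [('SN', 1)]
```

Asking for the 3rd-ranking airlines is then just airlines_at_rank(3), which returns an empty list for this tiny data set.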

  • Q2. The second goal is to provide a column that shows the percentage of the total amount of passengers that travel in each class.

I’ve left the earlier steps that lead up to this in the fiddle. Running this:

SELECT
 class    AS "Class",
 SUM(cnt) AS "Total",
 ROUND(SUM(cnt) * 1.0/(SELECT SUM(cnt) FROM flight) * 100) AS "Percentage"
FROM flight
GROUP BY class;

Result:

Class   Total   Percentage
Business    4           31
Economy     4           31
First       5           38

You may be wondering why there is a seemingly useless multiplication by 1.0 in the percentage calculation. It is there to avoid problems with integer division: try taking it out and all the percentages drop to 0! I found this snippet here. It’s much more elegant, although perhaps a bit more puzzling, than an explicit CAST. YMMV. +1 for the question for having provided me with a learning experience!
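To watch the integer-division problem happen, here is a small sketch with Python's sqlite3 module. The class totals (Business 4, Economy 4, First 5) match the result above, but how the Business and Economy bookings are split between airlines is my assumption, since only the First-class rows appear in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flight (class TEXT, iata TEXT, cnt INTEGER)")
conn.executemany(
    "INSERT INTO flight VALUES (?, ?, ?)",
    [
        # Assumed per-airline split; only the class totals come from the answer.
        ("Business", "BA", 3), ("Business", "SN", 1),
        ("Economy", "BA", 2), ("Economy", "EI", 2),
        ("First", "BA", 2), ("First", "EI", 2), ("First", "SN", 1),
    ],
)

def percentages(numerator_sql):
    """Run the percentage query with the given SQL expression as numerator."""
    sql = f"""
        SELECT class,
               ROUND({numerator_sql} / (SELECT SUM(cnt) FROM flight) * 100)
        FROM flight
        GROUP BY class
        ORDER BY class
    """
    return conn.execute(sql).fetchall()

truncated = percentages("SUM(cnt)")                  # integer / integer
with_float = percentages("SUM(cnt) * 1.0")           # the trick from the answer
with_cast = percentages("CAST(SUM(cnt) AS REAL)")    # the explicit alternative

print(truncated)   # [('Business', 0.0), ('Economy', 0.0), ('First', 0.0)]
print(with_float)  # [('Business', 31.0), ('Economy', 31.0), ('First', 38.0)]
print(with_cast)   # [('Business', 31.0), ('Economy', 31.0), ('First', 38.0)]
```

Both the 1.0 multiplication and the explicit CAST force floating-point division, so which one you use is a matter of taste.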

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.