Mastering Pandas Group By for Algorithmic Trading & Quantitative Analysis 28/100 Days

Pandas Group

Pandas GroupBy is an essential function for quantitative traders, algorithmic trading developers, and data analysts. Whether you’re building crypto trading strategies or testing quantitative analysis methods for stock markets, understanding how to group and aggregate data efficiently is crucial.

Now the last question is little bit Advanced that is calculate the we weap for each sticker volume wed average price so now if you are able to realize that slowly and gradually we are going into deep Alo trading welcome back to day 28 of the 100 days of H with python algo trading in previous session we have understood the group by object in pandas and today we will continue with the same as we have seen that how we can apply aggregate method on the group by object like how we can pass a dictionary in that how we can pass a list and now we’ll continue with that and we’ll see that how we can pass the combination of both dictionary and list let’s say you have this group by object tickers here uh which we have seen yesterday so if you are new to this video please watch the previous video to understand this better right so we have the stickers Group by object and now we want to apply the aggregate method on this and let’s say that on every column I want to apply multiple functions right so what I will do here I’ll just copy and paste dictionary here and and now what we can do we can also pass a list here.

So at this place we’ll pass a list and let’s say for volume I want some then I also want mean and I also want standard deviation for the the volume so means here we are passing a dictionary first and then at the values we are passing a list so we can have multiple operations on a single column right similarly for the price also I want the max price again I want the minimum price and I want let’s say mean of the price right.

Pandas Group1
Pandas Group4

So now when I shift enter you will see that we have a data frame with volume and price column and inside that we have applied multiple functions so so on volume we applied sum mean and standard deviation and again on the price we have applied the Max minan and mean right so that’s how you can use this functionality and actually here the price is not the real that’s why you are getting a little bit different values don’t worry once you have finished with this we’ll apply all these things on the live exchange okay and now we have

Looping on groups means we can apply loops on the group by object uh how let me show you so we know that we have the stickers uh Group by object so what I will do here let’s say for group and data in tickers and let’s print the group and data so when I shift enter you can see here we have data frame here and when I check this the group is the name of the group and the data is the data frame means we can also apply loops on this group by object and we can like perform various operations let’s say I want to

Print the tickers so what I will do here I will just remove this the name of group and I’ll just keep the data and here we know that we have a ticker column there and also we can apply let’s say unique method here so it will give us only the unique values right and we know that we have only six unique values in our ticker column right now we have the split apply and command mind which we have seen in the theoretical part that how it works and let me show you again so we have the stickers object

Here Group by and then we can apply any function here right this apply is the inbuilt function of the pandas so here we can pass any function it will give you the minimum value based on the SK value on the objects and numerical value on the columns right so you can see here you can even apply Max and you will get all the maximum values so on objects it will apply the SK values and on the numerical columns it will apply the normal numerical methods let’s say if you want to apply Group by on multiple

Columns so we know that this is our original data frame and we also know that we apply Group by on only categorical data so here we know that this tricker is categorical column this exchange is also categorical column so what I’ll do here I’ll write Group by and then I’ll pass LS here so first I’ll pass ticker I want this to be grouped by with the tickers then what I’ll do I’ll write buy and sell this one it will show us that on every ticker the different Buy and Sell Groups so when I H shift enter key error tickers I guess it is

Ticker so we have to correct this and we have this uh Group by object and now we can get the size of this so this is the size you can see here that in apple we do not have cell column so that’s why it is giving only one value right we can also group bu with the exchange so let’s say I pass here another column that is exchange and now when I check it will show us that on Apple we have only buy and we did only on NYSC correct BTC we have only sell column and that was done only on binance right in future we will perform all.

These operations on the live exchanges so please focus on this and try to understand this completely okay let’s say if you want to print particular group so you know that we have these many groups here the Apple by NYSC msft by NYSC xrp cell coinbase so let’s say I want this e buy and bybit so what I will do here I’ll just copy first this object from here and then I’ll give get group and here we have to pass a tle right so what I will do here we have to pass this e and then we have to pass the byy bit

Here correct so when you hit shift enter you will see that we have a data frame with the particular group which we have mentioned in the command right now we have aggregate on multiple group Buys so for that what we can do I’ll just go here and I will copy and paste and assign to it different variable let’s say I’ll make it as ticker exchange the variable name and then I’ll run this cell right and here we’ll pass a list of functions which we want to apply on this group by object so I’ll write Max I’ll write Min and now when I

Shift enter you will see that we have a data frame with Max and Min apply on all the columns right you can see here on time stamp on price volume trade ID and it will show the maximum value and the minimum value right okay now let’s practice some real life questions which we’ll be doing on day-by-day operations so this is our original data frame right the first question is how many trades were executed for each ticker so for that what we’ll do we’ll first group by with the ticker so we’ll get a group by

Object it is asking for the number of Trades so for that we can use count we can use size but here we’ll be using the size method it will give us the total number of rows for every ticket and that means the total number of trades right so before that let me tell you that where we can apply the group by so whenever you see this each in any question you can apply Group by at that place means first we have to find each ticker and for that we can Group by with the ticker right then it is asking the

Average price means we can fetch this column from that group by object and then the average means we can apply the mean method right so I’ll just copy and paste this group by object from here and then I can fetch the price column right so I’ll write price so you will get a object with the price column and then we can just simply use the mean method and you will have the mean price for every ticker right similarly the next question is what is the total volume traded for each ticker means we have to apply the

Group by object right so first we will apply the group by own ticker and then it is asking the volume means we’ll fetch this column the volume one right and on that it is asking the total means we can use this sum method so I’ll quickly copy this group by object and then I can fetch the volume column from this and now we know that we can apply the sum function function right and we’ll have the total volume of each scker next question is that what is the price of the first trade for each ticker so means it is asking only the first

trade means the first row and similarly for each ticker means first we will Group by with the ticker so means the same group by object right then the question is the price so again we can fetch with the price column you just have to understand the question carefully you’ll get answer automatically don’t worry about that right then then the key here is that it is asking for the first trate means the first row and if you remember we have seen that how we can extract the first row we can simply apply this first

Method and you will get the first row of every ticker means the first row from the price column and you will be having the first similarly you can also use the last let me uh show you and you will get the last last price right let me correct it again and we’ll keep it first only now the next question is what is the total trading volume on each exchange for each hour of the day I’ll copy it from here and I’ll paste it here for now you can see that it is object only so what I’ll do here I’ll simply

use the inbuilt object that is two date time and here we have to just pass the column so this is the column right The Exchange dfn time stamp and now when I hit shift enter you will see that it has been changed from object to date time right and now we can use the attribute DT do R and we’ll get the results in the r if you remember we have seen previously when we filtered the data frame in quarter so we only wanted the first second and third quarter results right similarly here we can get the RS

if you want here you can even change it to month and you will get all the months right but for now we only need our so I’ll write R and now we can add a new column to our existing data frame exchange DF and I’ll write the name r and it will be the new column of our exchange DF data frame right so now when you check the exchange DF you will see that we have a new column added right the r and the question was what is the total trading volume on each exchange for each hour of the day and now it’s very simple we just

Have to group by with the volume and R so what I’ll do here I’ll write exchange DF do group by and here I can pass first the volume and then the r right and now when I shift enter we’ll get the object with the volume and R Group by and it is asking for the sum right the total means we can apply sum here simply and you will get we have to wait actually here we wanted the volume so we have to fetch that and now we’ll get the sum of every R right so now the next question is what is the total buy and sell volume for each

Ticker each ticker means we have to group by with the ticker and also we have to group by with separate buy and sell so let’s quickly do that uh we have this exchange DF and inside that we have this buy and sell right first we’ll calculate what the buy so we’ll apply filter equal to equal to buy and we’ll get a Boolean Series right and here we can again make it as a mask and we’ll get only the buy right and here now we have to calculate the volume right so I’ll Group by first with the ticker because we have

To calculate for each ticker so we’ll get a group by object and then we know that we need to calculate the volume so I will just fetch the volume and we’ll again get a group by object and now we can simply calculate this sum and you’ll get the sum of Apple Ethan Microsoft because only in these ticket we had the buy similarly for the sell what you can do you can just change it to cell and you will see that we have for BTC Google xrp the cell volume so if you want you can just print both the values or you can

Assign let’s say the buy volume and not but it’s buy and again the sell volume you can simply uh print bu volume again the sell volume and you will be having the buy and sell you just need a little bit of practice and you will be good to go right now the last question is little bit Advanced that is calculate the V weap for each sticker volume weed average price so now if you are able to realize that slowly and gradually we are going into deep Alo trading we’ll be understanding various technical terms of

Algo trading and you let me know that if you want to have a separate video where we can understand all these kinds of terminologies right so to calculate the volume Ved average price you need to know the formula to calculate the same so what we can do you can simply do a Google search so let me open the Google and let me show you so this is our Google and you simply type weap you can see that this is the formula which says some of of volume into price divide by sum of volume right so you can just copy it and paste it in

Your notebook and let’s say here right okay now we can quickly calculate this so what we will do it is the question is it is asking us the wwap for each sticker each ticker means we have to group by with the ticker so let me quickly do that exchange okay it’s marked down we have to have a code exchange exchange DF do group by with the ticker right so we’ll be having a group by object then then what we can do we can apply the Lambda function here right so so for that so we have to apply a Lambda

function here so I’ll write Lambda here I’ll write Lambda X then the formula so it is the sum of volume into price so to find first the volume we’ll write the volume and multiplied by the price so X and then price correct you can again again like check here the name of the columns so it doesn’t mess up so you can see here we have the price and the volume that is correct right so we have to find the sum of these right so I’ll just apply the sum here sum of volume into price then divide by the sum of volume

So simply sum of volume so I can write sum and now when you hit shift enter you will get the vwap volume wed average price for each sticker so let’s do that and you will see that you have the weap for each sticker it’s very simple you can apply any formula don’t worry we’ll do complex and complex things in upcoming videos for now we are just building up the base so that was it for this video I’ll keep all these excise and everything in the GitHub I’ll just push that and you can let me know if you

Have any issues and we’ll see you in the next video Until then take care bye-bye and have a nice day

Watch this Day 28 video tutorial

Day 28 :Pandas GroupBy

1. What does the GroupBy operation in Pandas allow you to do?

2. Which of the following is the correct way to group a DataFrame by a column named ‘Sector’?

3. What will the following code return: df.groupby(‘Category’)[‘Sales’].sum()?

4. How can you apply a custom function to each group in a GroupBy object?

5. What is the purpose of the .agg() method in GroupBy?

6. Which of the following statements is true about GroupBy operations?

7. How do you reset the index after a GroupBy operation?

8. Which of the following methods can be used to transform group data into another DataFrame?

9. What does the .size() method return when used on a GroupBy object?

10. Which of the following can be passed to the .agg() method?

11. What is the correct syntax to group by multiple columns?

12. How can you group data by a specific column and calculate the mean of another column?

13. What will the following code do: df.groupby(‘Category’).filter(lambda x: x[‘Sales’].mean() > 200)?

14. Which method would you use to apply multiple aggregation functions at once?

15. How can you iterate over groups in a GroupBy object?

16. What will the following code do: df.groupby(‘Category’).ngroup()?

17. Which of the following methods is used to get the first value in each group?

18. What does the .cumcount() method do in a GroupBy operation?

19. How can you use GroupBy to rank data within each group?

20. Which of the following statements best describes the use of the .apply() method with GroupBy?