I have a dataset of 100,000 rows of online customer transactions covering one year. The columns are: product ID, product category, number of sales, date and time of purchase, and region of purchase.
There are 1,000 products in total. I was thinking of building a monthly sales forecast for each product. However, if I do that, I will have 12,000 rows (1,000 products x 12 months) with ~1,000+ one-hot-encoded features, so I am worried about overfitting. Having only one year of data is also going to be a problem, since each product gets just 12 monthly data points and no second year to learn seasonality from. So, what kind of problem would be more suitable for this dataset?
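To make the row-count arithmetic concrete, here is a minimal sketch of the monthly aggregation described above, using only the Python standard library. The toy transactions, the tuple layout, and the `monthly_totals` helper are all hypothetical stand-ins for the real dataset, not its actual schema.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical toy rows: (product_id, category, units_sold, timestamp, region).
transactions = [
    ("P001", "toys", 2, "2023-01-05 10:15:00", "north"),
    ("P001", "toys", 1, "2023-01-20 18:40:00", "south"),
    ("P001", "toys", 4, "2023-02-02 09:00:00", "north"),
    ("P002", "books", 3, "2023-01-11 12:30:00", "east"),
    ("P003", "books", 5, "2023-01-14 16:45:00", "east"),
]

def monthly_totals(rows, key_index):
    """Sum units sold per (key, 'YYYY-MM'); key_index picks the grouping column."""
    totals = defaultdict(int)
    for row in rows:
        month = datetime.strptime(row[3], "%Y-%m-%d %H:%M:%S").strftime("%Y-%m")
        totals[(row[key_index], month)] += row[2]
    return dict(totals)

# One row per (product, month): this is the table the per-product forecast trains on.
per_product = monthly_totals(transactions, key_index=0)

# Same aggregation keyed by category, shown only to compare table sizes.
per_category = monthly_totals(transactions, key_index=1)

# Upper bound for the full dataset: 1,000 products x 12 months.
max_rows = 1000 * 12

print(len(per_product), len(per_category), max_rows)
```

On the real data, the per-product table can have at most `max_rows` rows, and fewer if some products have no sales in some months. Changing `key_index` is a quick way to see how the training table shrinks or grows with the grouping level.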
submitted by /u/Mammoth_Network_6236 to r/learnmachinelearning