1. Objective:
   A. Detect outliers in the quantity of purchases at a supermarket.
   B. Detect outliers in whole transactions at a supermarket.

2. License:
   Free to use, but requires citation of the following paper:
   Pennacchioli, D., Coscia, M., Rinzivillo, S., Pedreschi, D. and Giannotti, F., ‘Explaining the Product Range Effect in Purchase Data’. In BigData, 2013.

3. Data Source:
   http://www.michelecoscia.com/?page_id=379

4. DataSet Info:
   This dataset was obtained from ‘Coop’, one of the largest Italian retail distribution companies. The original dataset contains approximately 25 million purchase records from January 2007 to December 2011. We merged the three separate files that come with the original dataset and included only the first 100000 purchases.

5. Field Meanings:
   A. customer_id: Unique customer ID.
   B. shop_id: Unique shop ID.
   C. product_id: Unique product ID.
   D. quantity: Quantity in which the product was purchased.
   E. price: Product price.
   F. distance: Distance between the customer’s house and the shop location, in meters.
   G. probable_cause: The field with the most influence in making the transaction an outlier.
   H. isOutlier: 1 (Outlier) / 0 (Normal)

6. Parameter Selection:
   A. Dashboard Usage: Detect Numerical Outlier
      Settings:
      1) Search command: | inputlookup supermarket.csv | head 1000
      2) Field to analyze: quantity
      3) Threshold method: Standard Deviation
      4) Threshold multiplier: 5
      5) Sliding window: N/A
   B. Dashboard Usage: Detect Categorical Outlier
      Settings:
      1) Search command: | inputlookup supermarket.csv
      2) Field(s) to analyze: customer_id, shop_id, product_id, quantity, price, distance
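
To illustrate what the Standard Deviation threshold method with a multiplier of 5 means for the quantity field, here is a minimal Python sketch. It is not the dashboard's implementation; the function name stdev_outliers, the use of the sample standard deviation, and the example quantities are assumptions made for illustration only.

```python
import statistics


def stdev_outliers(values, multiplier=5.0):
    """Return 1 for values outside mean +/- multiplier * stdev, else 0.

    Assumption: the sample standard deviation is used; the dashboard
    may compute its bounds differently.
    """
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)
    lower = mean - multiplier * stdev
    upper = mean + multiplier * stdev
    return [1 if (v < lower or v > upper) else 0 for v in values]


# Hypothetical quantities: 99 ordinary purchases and one very large one.
quantities = [1] * 99 + [500]
flags = stdev_outliers(quantities, multiplier=5.0)
print(sum(flags), "outlier(s) flagged")
```

With no sliding window, the mean and standard deviation are computed once over all rows returned by the search, so a larger multiplier flags fewer rows.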