This assignment is to load data from a CSV file, populate a SQLite database with it, and run in-database analytics with SQL.
Data Source
The data is given in CSV format, available at: (link to the .csv file removed; a sample follows)
Only a few rows of the 30k-row CSV are shown:
ID LIMIT_BAL SEX EDUCATION MARRIAGE AGE PAY_0 PAY_2 PAY_3 PAY_4 PAY_5 PAY_6 BILL_AMT1 BILL_AMT2 BILL_AMT3 BILL_AMT4 BILL_AMT5 BILL_AMT6 PAY_AMT1 PAY_AMT2 PAY_AMT3 PAY_AMT4 PAY_AMT5 PAY_AMT6 default.payment.next.month
1 20000 2 2 1 24 2 2 -1 -1 -2 -2 3913 3102 689 0 0 0 0 689 0 0 0 0 1
2 120000 2 2 2 26 -1 2 0 0 0 2 2682 1725 2682 3272 3455 3261 0 1000 1000 1000 0 2000 1
3 90000 2 2 2 34 0 0 0 0 0 0 29239 14027 13559 14331 14948 15549 1518 1500 1000 1000 1000 5000 0
4 50000 2 2 3 37 0 0 0 0 0 0 46990 48233 49291 28314 28959 29547 2000 2019 1200 1100 1069 1000 0
5 50000 1 2 1 57 -1 0 -1 0 0 0 -8617 5670 35835 20940 19146 19131 2000 36681 10000 9000 689 679 0
According to the site:
This dataset contains information on default payments, demographic factors, credit data, history of payment, and bill statements of credit card clients in Taiwan from April 2005 to September 2005.
Write Python code with the SQLite module to do the following.
SQLite Database Setup
Create a new database, connect to it, and create a table structure corresponding to data fields in the CSV data source.
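A minimal sketch of the setup step, assuming every field holds whole numbers as in the sample rows. The table name `credit` and the helper `create_table` are illustrative choices, not part of the assignment. The last column name contains dots, so it has to be double-quoted in SQL.

```python
import sqlite3

# Column names follow the CSV header shown above.
COLUMNS = (
    ["ID", "LIMIT_BAL", "SEX", "EDUCATION", "MARRIAGE", "AGE"]
    + ["PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]
    + [f"BILL_AMT{i}" for i in range(1, 7)]
    + [f"PAY_AMT{i}" for i in range(1, 7)]
    + ["default.payment.next.month"]
)


def create_table(conn, table="credit"):
    """Create a table with one INTEGER column per CSV field (assumed types)."""
    column_defs = ", ".join(
        f'"{name}" INTEGER PRIMARY KEY' if name == "ID" else f'"{name}" INTEGER'
        for name in COLUMNS
    )
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE "{table}" ({column_defs})')
    conn.commit()


# ":memory:" keeps this sketch free of side effects; pass a file name
# (for example "credit.db", a hypothetical name) to create a database on disk.
conn = sqlite3.connect(":memory:")
create_table(conn)
table_info = conn.execute('PRAGMA table_info("credit")').fetchall()
print(len(table_info), "columns created")
```

Connecting to a file name that does not exist yet is what creates the new database file, so no separate "create database" statement is needed in SQLite.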
Load Data into SQLite
Read data from the CSV file, use a loop to read each CSV line (data instance), and insert it into the SQLite table.
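The loading loop could look roughly like the sketch below. It assumes the real file is comma-separated with a header row. Two sample rows stand in for the file so the example runs on its own, and `load_csv` and the table name `credit` are hypothetical names.

```python
import csv
import sqlite3

# Stand-in for the real file: the header plus two sample rows from above,
# written comma-separated (an assumption about the actual file's delimiter).
SAMPLE_LINES = [
    "ID,LIMIT_BAL,SEX,EDUCATION,MARRIAGE,AGE,PAY_0,PAY_2,PAY_3,PAY_4,PAY_5,PAY_6,"
    "BILL_AMT1,BILL_AMT2,BILL_AMT3,BILL_AMT4,BILL_AMT5,BILL_AMT6,"
    "PAY_AMT1,PAY_AMT2,PAY_AMT3,PAY_AMT4,PAY_AMT5,PAY_AMT6,default.payment.next.month",
    "1,20000,2,2,1,24,2,2,-1,-1,-2,-2,3913,3102,689,0,0,0,0,689,0,0,0,0,1",
    "2,120000,2,2,2,26,-1,2,0,0,0,2,2682,1725,2682,3272,3455,3261,0,1000,1000,1000,0,2000,1",
]


def load_csv(conn, lines, table="credit"):
    """Insert each CSV line into the table; return the number of rows inserted."""
    reader = csv.reader(lines)
    header = next(reader)  # first line holds the column names
    column_defs = ", ".join(f'"{name}" INTEGER' for name in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({column_defs})')
    placeholders = ", ".join("?" for _ in header)
    insert_sql = f'INSERT INTO "{table}" VALUES ({placeholders})'
    inserted = 0
    for row in reader:  # one CSV line is one data instance
        if not row:  # skip blank lines
            continue
        # Values arrive as text; INTEGER column affinity converts them on insert.
        conn.execute(insert_sql, row)
        inserted += 1
    conn.commit()
    return inserted


conn = sqlite3.connect(":memory:")
inserted = load_csv(conn, SAMPLE_LINES)
print(inserted, "rows inserted")
```

With the real file, the same function can be given an open file object, for example `with open("credit.csv", newline="") as f: load_csv(conn, f)`, where `credit.csv` is a placeholder for the actual file name. The `?` placeholders let `sqlite3` bind the values instead of building SQL strings by hand.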
In-database Query and Analytics with SQLite Data
Code in Python with related SQL statements to do the following:
Update the data so that marriage=2 (single) and marriage=3 (others) are merged into 2 (single).
Remove all data records with negative BILL_AMT values (in any of BILL_AMT1 through BILL_AMT6);
Select and show the first 10 records in the database table, using SELECT … LIMIT… ;
Select and show all records with a BILL_AMT1 amount greater than 500k;
Compute the total number of records, average AGE, min LIMIT_BAL, max LIMIT_BAL in the data;
Count the number of records and compute the average AGE, min LIMIT_BAL, and max LIMIT_BAL for default.payment.next.month=0 (no default) vs. default.payment.next.month=1 (default), using GROUP BY;
Count the number of records and compute the average AGE, min LIMIT_BAL, and max LIMIT_BAL for each marriage group (1, 2), again using GROUP BY;
Count the # records in each marriage group who will default (1) vs. not default (0).
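The eight tasks above can be sketched as follows. This is one possible set of statements, not the only correct one. To stay self-contained it builds an in-memory table named `credit` (an assumed name) from the five sample rows instead of the full file.

```python
import sqlite3

BILL_COLS = [f"BILL_AMT{i}" for i in range(1, 7)]
COLUMNS = (
    ["ID", "LIMIT_BAL", "SEX", "EDUCATION", "MARRIAGE", "AGE",
     "PAY_0", "PAY_2", "PAY_3", "PAY_4", "PAY_5", "PAY_6"]
    + BILL_COLS + [f"PAY_AMT{i}" for i in range(1, 7)]
    + ["default.payment.next.month"]
)
DEFAULT_COL = '"default.payment.next.month"'  # dots require a quoted identifier

ROWS = [  # the five sample rows shown in the question
    (1, 20000, 2, 2, 1, 24, 2, 2, -1, -1, -2, -2, 3913, 3102, 689, 0, 0, 0, 0, 689, 0, 0, 0, 0, 1),
    (2, 120000, 2, 2, 2, 26, -1, 2, 0, 0, 0, 2, 2682, 1725, 2682, 3272, 3455, 3261, 0, 1000, 1000, 1000, 0, 2000, 1),
    (3, 90000, 2, 2, 2, 34, 0, 0, 0, 0, 0, 0, 29239, 14027, 13559, 14331, 14948, 15549, 1518, 1500, 1000, 1000, 1000, 5000, 0),
    (4, 50000, 2, 2, 3, 37, 0, 0, 0, 0, 0, 0, 46990, 48233, 49291, 28314, 28959, 29547, 2000, 2019, 1200, 1100, 1069, 1000, 0),
    (5, 50000, 1, 2, 1, 57, -1, 0, -1, 0, 0, 0, -8617, 5670, 35835, 20940, 19146, 19131, 2000, 36681, 10000, 9000, 689, 679, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE credit ({})".format(", ".join(f'"{c}" INTEGER' for c in COLUMNS)))
conn.executemany("INSERT INTO credit VALUES ({})".format(", ".join("?" * len(COLUMNS))), ROWS)

# 1. Merge marriage=3 (others) into marriage=2 (single).
conn.execute("UPDATE credit SET MARRIAGE = 2 WHERE MARRIAGE = 3")

# 2. Delete records with a negative value in any BILL_AMT column.
negative_bill = " OR ".join(f"{col} < 0" for col in BILL_COLS)
conn.execute(f"DELETE FROM credit WHERE {negative_bill}")
conn.commit()

# 3. First 10 records.
first_ten = conn.execute("SELECT * FROM credit LIMIT 10").fetchall()

# 4. Records with BILL_AMT1 greater than 500k.
large_bill = conn.execute("SELECT * FROM credit WHERE BILL_AMT1 > 500000").fetchall()

# 5. Overall summary: count, average AGE, min and max LIMIT_BAL.
SUMMARY = "COUNT(*), AVG(AGE), MIN(LIMIT_BAL), MAX(LIMIT_BAL)"
overall = conn.execute(f"SELECT {SUMMARY} FROM credit").fetchone()

# 6. Same summary per default status.
by_default = conn.execute(
    f"SELECT {DEFAULT_COL}, {SUMMARY} FROM credit "
    f"GROUP BY {DEFAULT_COL} ORDER BY {DEFAULT_COL}"
).fetchall()

# 7. Same summary per marriage group.
by_marriage = conn.execute(
    f"SELECT MARRIAGE, {SUMMARY} FROM credit GROUP BY MARRIAGE ORDER BY MARRIAGE"
).fetchall()

# 8. Record count per marriage group and default status.
marriage_by_default = conn.execute(
    f"SELECT MARRIAGE, {DEFAULT_COL}, COUNT(*) FROM credit "
    f"GROUP BY MARRIAGE, {DEFAULT_COL} ORDER BY MARRIAGE, {DEFAULT_COL}"
).fetchall()

for label, result in [("first 10", first_ten), ("BILL_AMT1 > 500k", large_bill),
                      ("overall", overall), ("by default", by_default),
                      ("by marriage", by_marriage), ("marriage x default", marriage_by_default)]:
    print(label, result)
```

The `ORDER BY` clauses are not required by the assignment; they only make the grouped output appear in a predictable order.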
The next step is to use Python with MongoDB:
Write Python code to load the same data into MongoDB and conduct the same analysis as above.
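A rough sketch of how the MongoDB version might be structured with `pymongo`, under several assumptions: a server reachable at `mongodb://localhost:27017`, a database `credit_db` and collection `clients` (both hypothetical names), and the last field renamed to `default_next_month`, since dots in field names are awkward to query in MongoDB. The filters and pipelines are plain Python data, so they can be inspected without a running server. `run_analysis` is defined but not called here.

```python
BILL_COLS = [f"BILL_AMT{i}" for i in range(1, 7)]
DEFAULT_FIELD = "default_next_month"  # hypothetical rename of default.payment.next.month


def row_to_document(header, row):
    """Turn one CSV row (list of strings) into a MongoDB document."""
    doc = {}
    for name, value in zip(header, row):
        key = DEFAULT_FIELD if name == "default.payment.next.month" else name
        doc[key] = int(float(value))  # assumes every field is a whole number
    return doc


MERGE_FILTER = {"MARRIAGE": 3}
MERGE_UPDATE = {"$set": {"MARRIAGE": 2}}
NEGATIVE_BILL_FILTER = {"$or": [{col: {"$lt": 0}} for col in BILL_COLS]}


def summary_pipeline(group_key=None):
    """Aggregation stages: count, average AGE, min/max LIMIT_BAL per group."""
    return [
        {"$group": {
            "_id": group_key,  # None puts the whole collection in one group
            "count": {"$sum": 1},
            "avg_age": {"$avg": "$AGE"},
            "min_limit": {"$min": "$LIMIT_BAL"},
            "max_limit": {"$max": "$LIMIT_BAL"},
        }},
        {"$sort": {"_id": 1}},
    ]


def run_analysis(documents, uri="mongodb://localhost:27017"):
    """Load the documents and run the same steps as the SQL version."""
    # Imported here so the helpers above work without pymongo installed.
    from pymongo import MongoClient

    client = MongoClient(uri)
    coll = client["credit_db"]["clients"]
    coll.drop()  # start from an empty collection
    coll.insert_many(documents)
    coll.update_many(MERGE_FILTER, MERGE_UPDATE)
    coll.delete_many(NEGATIVE_BILL_FILTER)
    default_key = "$" + DEFAULT_FIELD
    results = {
        "first_10": list(coll.find().limit(10)),
        "large_bill": list(coll.find({"BILL_AMT1": {"$gt": 500000}})),
        "overall": list(coll.aggregate(summary_pipeline())),
        "by_default": list(coll.aggregate(summary_pipeline(default_key))),
        "by_marriage": list(coll.aggregate(summary_pipeline("$MARRIAGE"))),
        "marriage_by_default": list(coll.aggregate(
            summary_pipeline({"marriage": "$MARRIAGE", "default": default_key})
        )),
    }
    client.close()
    return results


example_doc = row_to_document(["ID", "AGE", "default.payment.next.month"], ["1", "24", "1"])
print(example_doc)
```

The `$group` stage plays the role of `GROUP BY`, while `update_many` and `delete_many` correspond to the `UPDATE` and `DELETE` statements. The documents themselves can be produced by applying `row_to_document` to each line read with `csv.reader`.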
