BBM467 - Data Intensive Applications (Spring 2020)
Lecturer: Dr. Fuat Akal
Lectures: Wednesdays 09:00-12:00 @D8
Practicum (BBM469): Fridays 14:00-16:00 @D8
Assistants: Alaettin Uçan
Office Hours: Open door policy
Announcements will be made through the Piazza page only. See the very bottom of this page.
The objective of this course is to teach students the fundamentals of big data management and analytics. They will gain experience with some key technologies, platforms, tools and systems used by big data scientists and engineers. Keywords: Big Data, Data Science, Distributed Computing, Cluster Computing, Scalable Machine Learning, Cloud Computing and Virtualization, Data and Ethics.
Prior to attending this course, participants MUST have a basic understanding of computer systems, databases, distributed computing and machine learning. Python programming knowledge would be useful.
Lectures will be conducted in English in the classroom. Course materials will be in English. Attendance is mandatory and will be rewarded in grading. Be aware that this course requires hard work.
Lab Work (BBM469):
Although BBM469 is an independent course, it goes hand in hand with BBM467. It is strongly recommended that you enroll in BBM469 only if you are enrolled in BBM467. There will be three assignments (0% + 25% + 25%) and a Data Science Capstone Project, DSCP (50%). Students may work alone or in groups of at most two. Assignments and projects must be delivered by their deadlines. Late deliveries will be penalized by 10 points per day for at most three days.
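The weighting and late penalty above can be sketched in Python as follows. This is only an illustration, not the official grading procedure: the function names `late_penalty` and `lab_grade` are hypothetical, scores are assumed to be on a 0-100 scale, the penalty is assumed to be subtracted from the assignment score before weighting, and deliveries more than three days late are assumed not to be accepted.

```python
def late_penalty(days_late: int) -> int:
    """Points deducted for a late delivery: 10 per day, for at most three days."""
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 3:
        # Assumption: deliveries more than three days late are not accepted.
        raise ValueError("more than three days late: assumed not accepted")
    return 10 * days_late


def lab_grade(a2: float, a3: float, dscp: float) -> float:
    """Weighted BBM469 grade. Assignment 1 carries 0% and is left out."""
    return 0.25 * a2 + 0.25 * a3 + 0.50 * dscp


# Example: Assignment 2 scored 90 but was delivered two days late.
# Assumption: a penalized score never drops below zero.
a2 = max(0, 90 - late_penalty(2))
print(lab_grade(a2, 80, 85))
```

With these example scores, Assignment 2 counts as 70 after the penalty.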
The lecturer strongly believes that any student can pass the class with a good grade as long as she/he tries hard enough. There will be one written midterm examination (40%) and one final examination (60%).
Attendance is NOT mandatory due to the pandemic.
The schedule is tentative for the moment.
|Week #||Date||Title||Slides||Reading||BBM469 Lab Work (deadlines are on Fridays unless stated explicitly)|
|1||26.02||What is Data Science? Introduction to Big Data|| || || |
|2||04.03||Data Science Methodology|| || || |
|3||11.03||Python for Data Science|| || ||Lab Session: Open Source Tools for Data Science; Assignment 1 (out): Python Exercises (Take home, no delivery)|
|4||18.03||No Lecture due to Corona Break|| || || |
|5||25.03||No Lecture due to Corona Break|| || || |
|6||01.04||Data Analysis with Python|| || ||Quiz (Coverage: Python for Data Science and Assignment 1); Deadline for Building DSCP Groups|
|7||08.04||Data Visualization with Python|| || || |
|8||15.04||Machine Learning with Python|| || ||Submission of DSCP Proposals (Deadline: 22.04, Midnight); Assignment 2 (out): Clustering and Classification with Python (Deadline: 01.05, Midnight)|
|9||22.04||Foundations for Big Data Systems|| || || |
|10||29.04||Scalable Machine Learning with Spark|| || ||Assignment 3 (out): Machine Learning with Spark (Deadline: 15.05, Midnight)|
|13||20.05||NoSQL|| || ||DSCP Final Deliveries (Deadline: 27.05, Midnight)|
|14||27.05||Data and Ethics|| || || |
I do not follow a specific textbook, but here are a few books I can refer to.
The course webpage will be kept up to date throughout the semester. All course-related communications will be carried out through Piazza.
Anonymous Feedback Forms:
Please use Fuat's Anonymous Feedback Form if you have something to tell me in private while staying anonymous. Do not forget that this form is not for informing on your friends!