Beginning on Friday, April 8 at 5 p.m. and continuing through Sunday, April 10 at 12 p.m., 10 student teams from Wesleyan University, Trinity College, Connecticut College, Yale University, and the University of Connecticut competed in Wesleyan’s first DataFest competition. There were 52 students registered for the event, 21 of whom came from schools other than Wesleyan.
A massive data set was revealed on Friday evening, and students worked all weekend to find meaning in the numbers. The students presented their findings to judges on Sunday afternoon, and some groups received prizes for their work.
This year marks the first time that Wesleyan has hosted a DataFest competition.
“DataFest was founded at UCLA in 2011, when 30 students gathered for 48 intense hours to analyze five years of arrest records provided by Lt. Thomas Zak of the Los Angeles Police Department,” the American Statistical Association’s (ASA) website reads. “ASA DataFest is now sponsored by the American Statistical Association and hosted by several of the most prestigious colleges and universities in the country.”
Director of Centers for Advanced Computing and Visiting Assistant Professor of Quantitative Analysis Manolis Kaparakis was one of the professors who wanted Wesleyan to host a DataFest competition this year. Assistant Professor of the Practice in the Quantitative Analysis Center (QAC) Valerie Nazzaro then organized the event.
“I was part of Five College DataFest last spring when I was a visiting assistant professor at Smith College,” Nazzaro wrote in an email to The Argus. “It was an exciting event and it was great to see students excited and willing to sacrifice an entire weekend to try and make sense of a complex data set.”
Ahead of the competition, the QAC hosted a series of workshops to help students prepare. The workshop themes, which students selected, included “Data Visualization in R,” “Working with data in R,” “Data Visualization—Population dot maps,” “Extending Statistical computing in R to complex data,” “Machine learning—selected applications,” “Introduction to statistical modeling,” “Introduction to Data Visualization with Tableau,” and “Taming the Monster–Tackling Complex Data.”
Many competing students were drawn to DataFest because of their love for data analysis.
“[I enjoy] solving problems and making insights out of nothing,” said Carlo Medina ’18.
At the beginning of the competition, Medina and his teammates were excited to tackle the data.
“This is a problem that we all are going to solve together; we just can’t expect it,” said Kyle Akepanidtaworn ’18. “It’s exciting.”
Medina spoke about his goals for his DataFest team.
“Our main concern right now is hopefully our code runs,” Medina said. “Hopefully we’ll find a story out of the data. I think that’s what’s important.”
After Friday evening’s data set reveal, teams worked hard to find meaning in the numbers. Anne Schwartz of Connecticut College described her team’s thought process.
“We thought about which variables were significant and what we wanted to show and then kind of just looked into what those variables showed us,” Schwartz said.
Teams adopted different strategies; some worked autonomously, while others asked for help from the consultants in attendance. These consultants included Wesleyan alumni, UConn Ph.D. students, faculty from all participating universities, and industry professionals from Pfizer, Boehringer Ingelheim, Union Mobile, Google, MassMutual, The Brattle Group, and Statistical Analysis Software (SAS). DataFest served to bring together undergraduates and professionals, providing an excellent opportunity for networking.
“It is designed to bring together current students, alumni and data analysis professionals as they work together in addressing real world problems that involve computational data work,” reads Wesleyan’s DataFest blog. “The event also provides an opportunity for recruiters to connect with students interested and skilled in data analysis that may be candidates for internships, job openings, etc.”
Associate Director and Visiting Assistant Professor of the QAC Pavel Oleinikov was one of the consultants.
“My responsibility during the event was to be a coach,” Oleinikov said. “There are two parts to making this kind of research. One is to come up with a good question, and another is to write a code that will work.”
George S. Habek, a senior analytical consultant for SAS, also helped students throughout the weekend.
“[My role is] to get the students to think about a step-by-step process to get analytics and prepare the data and get information and insight out of it,” Habek said. “So I’m here to give them tips and tricks and pointers and keep them motivated.”
Habek was happy with how students were working together.
“I see a lot of collaboration happening, [and] I think that was the intent,” Habek said. “And there’s a lot of brainstorming just like you would see in a work place, so I think it’s working out really well.”
Nazzaro further described what she finds exciting about DataFest.
“There is no better way to strengthen quantitative reasoning and computational skills than to be immersed in a problem with a ridiculous deadline and to be surrounded by people with a common goal,” she wrote.
On Sunday at 12 p.m., the students’ data analysis time ended. They prepared PowerPoint slides and four-minute presentations to show the judges what they found. After judging, organizers gave out an array of awards: Carlo Medina, Korkrid (Kyle) Akepanidtaworn, Amanda Yeoh ’19, Poom Chiarawongse ’19, and Joshua Su ’17 won Honorable Mention; Catherine Marquez ’16, Alexandra De Beaux ’17, Ariel Kaluzhny ’16, and Stephanie Ling ’16 won Best Insight; Jiachen Liang, Peter Tallcouch, Daniel Brink, and Nicholas Illenberger won Best Business Application; and Tiffany Coons ’18, Samara Prywes ’17, and Jack Gorman ’19 won Best in Show.
The judges included Jennifer McGinness from Boehringer Ingelheim; Barb Nangle from Yale University; Academic Computing Manager for the Social Sciences Jason Simms; and Sanvir Junnarkar and Alexander Hoyle from The Brattle Group.
Students and faculty alike were satisfied with the outcome of Wesleyan’s first DataFest. Oleinikov hopes that Wesleyan will host more competitions and that more students will participate in the future. He said that students may feel intimidated by the competition but that anyone is eligible to compete, regardless of how much experience they have with data analysis.
“Often, you can make a story without doing something that requires 500 pages from a textbook,” Oleinikov said.
John Rissmiller, a sophomore at Connecticut College, was also content with his experience at DataFest.
“It went really well; we had a lot of fun,” Rissmiller said. “It was really challenging, but overall really interesting. I think we all learned a lot as far as data analysis goes, [as well as] data visualization techniques that we didn’t know before.”
Nazzaro added that she received a lot of positive feedback from the consultants.
“[The consultants] were impressed with how the students dealt with their frustrations. Their ability to bounce back and redirect after hitting a dead end says a lot about how they will handle real world research and data challenges.”