Finding patterns to make sense of complex data
Isabel Cruz makes sense of complex data.
“One of the aspects I work on is heterogeneous data — for example, how to make sense of data that is collected with different spatial and temporal granularities,” said Cruz, professor of computer science.
The U.S. Census, for example, collects data every 10 years, but other population data within a city may be collected every month or every year.
“So if you want to tell the story of what’s happening in a city, you cannot put apples and oranges together,” she said.
Her job is to use semantics so that data make sense when put together.
“This is a huge challenge,” she said.
Through her research, Cruz extracts patterns from big data to perform predictive analytics.
“If you understand patterns, then you may be able to predict what is going to happen next,” she said. “What makes this possible is having a lot of data.”
Cruz is collaborating with the Chicago Department of Information Technology to help implement the “OpenGrid” initiative. She’s helping to integrate large datasets published on the city’s online portal.
“Chicago has been a leader in making data public and publishing it online — and that’s no minor achievement,” she said. “The next step is how to put together different datasets.”
The city publishes, for example, data on health inspections at establishments that serve food — restaurants, day cares, and other businesses.
“They will revisit businesses that have failed previous inspections so it is very important to put the data together so that we can see if the businesses have had other registered names, how long have they been there,” she said. “There are thousands of businesses in the city and there are all kinds of aliases for businesses. Putting all this together helps inspectors know where to go next.”
Last summer, Cruz was named a winner of Grand Challenges Explorations, an initiative funded by the Bill & Melinda Gates Foundation. She was one of 28 winners among 1,500 applicants, an acceptance rate of less than 2 percent. With her $100,000 award, she is developing a data integration platform to monitor and predict the location of malaria cases over time. The data will be used to drive targeted mosquito elimination efforts.
“In order to eliminate malaria, we need to be able to accurately monitor the changing patterns of infection across an entire region,” Cruz said. “Infection rates may be affected by multiple factors including the location of health centers, temperature, rainfall, type of landscape and population distribution.”
She developed her interest in databases while in graduate school at the University of Toronto, where she received her Ph.D. in 1994.
“I was fascinated by data modeling,” she said. “What brought me to big data was that it involved a lot of practical research — real projects like working with the city. While in graduate school, I developed the first graph query languages, a subject that is of great importance now because of social networks. That work still gets many citations.”
She joined UIC as an associate professor in 2001 and became a full professor in 2009.
What has kept her at UIC for 17 years?
“I love working with my students. They are the best and really keep me going,” she said.
“What brought me to UIC was the strength of its research, especially in database and information systems. It had and it has very well-known researchers, and now it’s an even stronger group in terms of big data and data science.”