The CIS Lab focuses on data analysis and data mining algorithms, and on developing practical services and software built on them. Our main research areas, past and present, are described below.
Intelligent data extraction and analysis: Algorithms that automatically extract useful data from online sources benefit many applications. The extracted data can be leveraged for analytics and for building expert systems.
Data stream mining and incremental learning: With the progress of large-scale distributed systems, huge amounts of data increasingly originate from dispersed sources in the form of transient data streams. Traditional learning algorithms become impractical in this setting because sufficient training data may not be available up front. In addition, concept drift may change the data distribution, which requires the learned model to be updated. Limited memory and high data arrival rates pose further challenges for incremental learning. Incremental text clustering, incremental active learning, timeline summarization, and learning with deep models are among our active research projects in this area.
Demo: Analysis and clustering of textual data (link to?)
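To make the data stream setting more concrete, the following is a minimal sketch of one-pass, constant-memory clustering (sequential k-means). It is only an illustration of the incremental idea, not the lab's own method; the class name OnlineKMeans, the learn_one method, the learning_rate parameter, and the synthetic two-cluster stream are all assumptions made for this example.

```python
import random


class OnlineKMeans:
    """Sequential k-means: sees each point once and stores only k centroids."""

    def __init__(self, k, learning_rate=None):
        self.k = k
        # Hypothetical knob: a constant rate forgets old data faster, which
        # can help under concept drift. None means a running mean (1 / count).
        self.learning_rate = learning_rate
        self.centroids = []
        self.counts = []

    def learn_one(self, x):
        """Update the model with a single point and return its cluster index."""
        x = list(x)
        # The first k points seed the centroids.
        if len(self.centroids) < self.k:
            self.centroids.append(x)
            self.counts.append(1)
            return len(self.centroids) - 1
        # Find the nearest centroid by squared Euclidean distance.
        j = min(
            range(self.k),
            key=lambda i: sum((c - a) ** 2 for c, a in zip(self.centroids[i], x)),
        )
        self.counts[j] += 1
        eta = self.learning_rate if self.learning_rate else 1.0 / self.counts[j]
        # Move the winning centroid a step toward the new point.
        self.centroids[j] = [c + eta * (a - c) for c, a in zip(self.centroids[j], x)]
        return j


def synthetic_stream(n, seed=0):
    """Yield n two-dimensional points alternating between two noisy clusters."""
    rng = random.Random(seed)
    for i in range(n):
        center = 0.0 if i % 2 == 0 else 10.0
        yield (center + rng.gauss(0, 1), center + rng.gauss(0, 1))


model = OnlineKMeans(k=2)
for point in synthetic_stream(1000):
    model.learn_one(point)  # the point can be discarded after this call

print(model.centroids)
```

Because each point is used once and then dropped, memory use stays fixed regardless of stream length. Passing a constant learning_rate trades stability for faster adaptation when the data distribution changes.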
Text mining: Mining textual data from different perspectives is important in various applications. Topic detection and tracking in social networks, text generation based on deep models, semantic analysis of incremental word embedding models, and automatic labeling of textual clusters are among the projects in this area.
Demo: Text generation using GANs (link to: http://textGeneration.freehost.io/)
Data mining: The CIS Lab actively studies both the development of data mining algorithms that cope with the limitations encountered in real-world applications and the application of existing algorithms to diverse kinds of data (e.g. recommender systems, financial data, computer networks, social data, business processes).
Data analysis with deep learning: Exploring how deep networks perform in tasks such as incremental learning, data generation, network traffic analysis, and spatio-temporal data analysis offers insight into the capabilities and limitations of these models.