
Machine Learning and Artificial Intelligence (AI)

Research Area Overview

At CaSToRC, research in Machine Learning (ML) and Artificial Intelligence (AI) focuses on designing novel, robust, and scalable algorithms that address core challenges in ML and AI, including generalization, interpretability, fairness, and efficiency.

By leveraging expertise in areas such as deep representation learning, computer vision, and signal processing, the group develops methods that:

  • Draw inspiration from the brain and other biological systems to generate new machine learning architectures that are sparse, modular and hierarchical.
  • Reduce the computational requirements of training neural networks, making AI more accessible while reducing its carbon footprint.
  • Interpret the inner workings of deep networks by designing appropriate decompositions of network weights and activations.
  • Provide explainable decisions to build trust in user communities.
  • Incorporate prior knowledge and geometry-aware learning.
  • Mitigate classifier bias and increase fairness and diversity metrics.


These results and this knowledge are transferred to critical interdisciplinary applications, including health and medical imaging, climate change, earth observation, and smart farming.

 

Research Highlights

Research Highlight 1

Title: Sparsity-Driven AI: Path Optimization and Hierarchical Modular Networks

Related people: Constantine Dovrolis and Shreyas Malakarjun Patil (PhD student at Georgia Tech)


Description
Sparse neural networks derived through PHEW and Neural Sculpting, highlighting hierarchical modularity and optimized paths, offer performance benefits in terms of generalization and learning.
 
Overview
This research presents novel approaches to enhancing neural networks' efficiency and interpretability by leveraging sparsity. PHEW (Paths with Higher Edge Weights) and Neural Sculpting tackle challenges in computational inefficiency and lack of transparency by exploring sparsity at different stages of the model lifecycle. These methods reveal how sparse networks can retain or even surpass the performance of their dense counterparts while being computationally lightweight and aligned with the structure of real-world tasks.
 
Scientific Achievement
PHEW introduces a probabilistic method for identifying sparse sub-networks at initialization, using biased random walks that favor high-weight paths. The resulting sparse networks are computationally efficient and generalize well, and constructing them requires no training data.
Neural Sculpting complements this by iteratively pruning networks during training to expose their inherent hierarchical modularity. This method not only enhances model interpretability by aligning network structures with task hierarchies but also reduces redundancy, boosting overall performance.
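
As a rough illustration of the biased random walk idea behind PHEW, the sketch below selects edges of a small fully connected network without using any data. It assumes that a walk moves from a unit to a next-layer unit with probability proportional to the absolute weight of the connecting edge, and that walks start from input units in round-robin order. Both choices, the function name `biased_walk_mask`, and the toy weights are illustrative assumptions rather than details confirmed by the text above.

```python
import random

def biased_walk_mask(weights, n_walks, seed=0):
    """Mark edges visited by weight-biased random walks from input to output.

    weights[l][i][j] is the weight from unit i in layer l to unit j in layer l + 1.
    Returns boolean masks of the same shape (True = edge kept).
    Assumes every unit has at least one nonzero outgoing weight.
    """
    rng = random.Random(seed)
    masks = [[[False] * len(row) for row in layer] for layer in weights]
    n_inputs = len(weights[0])
    for walk in range(n_walks):
        unit = walk % n_inputs  # assumption: start units are visited round-robin
        for depth, layer in enumerate(weights):
            magnitudes = [abs(w) for w in layer[unit]]
            # Higher-magnitude edges are proportionally more likely to be chosen.
            nxt = rng.choices(range(len(magnitudes)), weights=magnitudes, k=1)[0]
            masks[depth][unit][nxt] = True
            unit = nxt
    return masks

# Toy 2-3-2 network with hypothetical weights.
toy_weights = [
    [[0.5, -1.0, 0.1], [0.2, 0.3, -0.4]],
    [[1.0, -0.5], [0.3, 0.3], [0.2, 0.9]],
]
toy_masks = biased_walk_mask(toy_weights, n_walks=4, seed=1)
kept_per_layer = [sum(sum(row) for row in layer) for layer in toy_masks]
print(kept_per_layer)
```

Because each walk contributes one edge per layer, every kept edge lies on a complete input-to-output path, and the number of walks bounds the number of kept edges in each layer.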
 
Significance and Impact
These methods represent a significant step forward in the development of sustainable and interpretable AI. By optimizing neural network structures, they enable faster training, lower computational costs, and improved scalability, particularly for edge devices and resource-constrained environments. Additionally, Neural Sculpting's ability to reveal hierarchical modularity enhances the transparency of AI systems, fostering trust and explainability in applications ranging from healthcare to autonomous systems.
 
Research Details
  • PHEW (Paths with Higher Edge Weights):
    • Uses biased random walks to identify high-weight paths within dense networks.
    • Constructs sparse sub-networks that retain critical architectural properties.
    • Demonstrates robust generalization and fast convergence.
  • Neural Sculpting:
    • Iteratively prunes network connections to uncover hierarchical modularity.
    • Aligns network architecture with task-specific structures.
    • Enhances interpretability and reduces overfitting in data-scarce scenarios.
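
The iterative pruning loop can be sketched generically as follows. This uses plain magnitude pruning as a stand-in: the pruning criterion Neural Sculpting actually applies, and the network analysis that detects modules, are not specified above and are not reproduced here. The function names and the optional `retrain` hook are hypothetical.

```python
def prune_step(weights, mask, fraction):
    """Drop the `fraction` smallest-magnitude weights among those still kept."""
    alive = sorted((abs(w), i) for i, (w, m) in enumerate(zip(weights, mask)) if m)
    n_drop = int(len(alive) * fraction)
    dropped = {i for _, i in alive[:n_drop]}
    return [m and i not in dropped for i, m in enumerate(mask)]

def iterative_prune(weights, rounds, fraction, retrain=None):
    """Alternate (optional) retraining and pruning for a fixed number of rounds."""
    mask = [True] * len(weights)
    for _ in range(rounds):
        if retrain is not None:
            # Hypothetical hook: returns updated weights given the current mask.
            weights = retrain(weights, mask)
        mask = prune_step(weights, mask, fraction)
    return mask

# Hypothetical flat weight vector; no retraining between rounds in this toy run.
flat_weights = [0.9, -0.1, 0.5, 0.05, -0.7, 0.3, 0.2, -0.8]
final_mask = iterative_prune(flat_weights, rounds=2, fraction=0.5)
print(final_mask)
```

With two rounds at 50% each, only the two largest-magnitude weights in the toy vector remain. Pruning gradually, rather than in one shot, gives the network a chance to adapt between rounds when a real `retrain` step is supplied.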

References
  • Patil, S. M., & Dovrolis, C. (2021). PHEW: Constructing Sparse Networks That Learn Fast and Generalize Well. International Conference on Machine Learning (ICML).
  • Patil, S. M., Michael, L., & Dovrolis, C. (2023). Neural Sculpting: Uncovering Hierarchically Modular Task Structure in Neural Networks Through Pruning and Network Analysis. Neural Information Processing Systems (NeurIPS).

Research Highlight 2

Title: Neuro-Inspired Architectures for Continual Learning

Related people: Constantine Dovrolis and Burak Gürbüz (PhD student at Georgia Tech)

Description
Our research in neuro-inspired AI draws on what is currently known about the brain (and other biological systems) to design new architectures and learning algorithms, and to provide a deeper understanding of the relation between structure and function in neural networks.
 
Overview
In recent years, neuro-inspired artificial intelligence has taken a significant leap forward. The architectures NISPA and NICE address challenges in continual learning without the need for traditional replay mechanisms. Inspired by brain plasticity and neurogenesis, the processes by which new synapses and neurons are formed in the brain, these models aim to overcome the limitations of existing neural network architectures.
 
Scientific Achievement
The NISPA (Neuro-Inspired Stability-Plasticity Adaptation) and NICE (Neurogenesis Inspired Contextual Encoding) frameworks introduce biologically inspired mechanisms that enable neural networks to adapt incrementally to new tasks while preserving performance on previously learned ones. By leveraging modular, sparse architectures and a neurogenesis-inspired strategy, these models achieve state-of-the-art performance in class-incremental learning without requiring replay of prior data, which is critical in privacy-sensitive applications.
 
Significance and Impact
These innovations advance the field of continual learning by providing a biologically plausible solution to catastrophic forgetting, a key challenge in artificial intelligence. Avoiding replay not only addresses privacy concerns around storing past data but also reduces memory and computational demands. These developments could benefit applications such as medical diagnostics, adaptive robotics, and personalized learning systems.
 
Research Details
  1. NISPA Framework (ICML 2022): This framework focuses on balancing stability and plasticity in neural networks through modularity and sparse connectivity. NISPA dynamically adjusts its structure in response to new tasks, preserving critical connections for prior tasks while enabling adaptability.
  2. NICE Framework (CVPR 2024): NICE extends these ideas with a neurogenesis-inspired approach, in which new "neural modules" are selectively generated to handle task-specific features. Through contextual encoding, NICE keeps knowledge disentangled across tasks, achieving superior accuracy in class-incremental learning benchmarks.
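
One simple way to picture the stability-plasticity trade-off described for NISPA is a per-connection freeze mask: connections judged important for earlier tasks are left untouched while the rest keep learning. The sketch below shows only that generic idea. How NISPA decides which connections to preserve, and how NICE creates new modules, is not described above, so the `frozen` mask is supplied by hand and all names and values are hypothetical.

```python
def masked_sgd_step(weights, grads, frozen, lr=0.5):
    """Gradient step that skips connections frozen for earlier tasks."""
    return [w if f else w - lr * g for w, g, f in zip(weights, grads, frozen)]

weights = [1.0, 2.0, 3.0]
frozen = [True, False, True]      # hypothetical: connections 0 and 2 serve an old task
new_task_grads = [1.0, 1.0, 1.0]  # hypothetical gradients from a new task
updated = masked_sgd_step(weights, new_task_grads, frozen)
print(updated)
```

Only the unfrozen connection changes, so whatever the frozen connections computed for the earlier task is preserved without storing or replaying any of that task's data.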
References
  1. Gürbüz, B., & Dovrolis, C. (2022). NISPA: Neuro-inspired Stability-Plasticity Adaptation for Continual Learning in Sparse Networks. International Conference on Machine Learning (ICML).
  2. Gürbüz, B., Moorman, J. M., & Dovrolis, C. (2024). NICE: Neurogenesis Inspired Contextual Encoding for Replay-Free Class Incremental Learning. Conference on Computer Vision and Pattern Recognition (CVPR).


 


Selected Publications

  • Y. Panagakis et al., ‘Tensor Methods in Computer Vision and Deep Learning’, Proc. IEEE, vol. 109, no. 5, pp. 863–890, May 2021, doi: 10.1109/JPROC.2021.3074329.
  • M. B. Gurbuz and C. Dovrolis, ‘NISPA: Neuro-inspired stability-plasticity adaptation for continual learning in sparse networks’, International Conference on Machine Learning (ICML), 2022. Available: https://icml.cc/virtual/2022/spotlight/16096
  • J. Oldfield, C. Tzelepis, Y. Panagakis, M. A. Nicolaou, and I. Patras, ‘PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs’, Feb. 06, 2023, arXiv: arXiv:2206.00048. Accessed: Nov. 21, 2024. Available: http://arxiv.org/abs/2206.00048
  • J. Oldfield, C. Tzelepis, Y. Panagakis, M. A. Nicolaou, and I. Patras, ‘Parts of Speech-Grounded Subspaces in Vision-Language Models’, Nov. 12, 2023, arXiv: arXiv:2305.14053. Accessed: Nov. 21, 2024. Available: http://arxiv.org/abs/2305.14053
  • S. M. Patil, L. Michael, and C. Dovrolis, ‘Neural Sculpting: Uncovering hierarchically modular task structure in neural networks through pruning and network analysis’, Neural Information Processing Systems (NeurIPS) conference, 2023. Available: http://arxiv.org/abs/2305.18402
  • M. B. Gurbuz, J. M. Moorman, and C. Dovrolis, ‘NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning’, in 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). DOI: 10.1109/CVPR52733.2024.02233.

 

 
