As large-scale models and datasets grow, techniques for data curation, training objective definition, and quality monitoring become essential. Mastering them supports efficient dataset exploration, robust model development, and well-informed decision-making in AI and computer vision workflows.
Dimensionality reduction tools such as t-SNE, UMAP, and h-NNE can reveal meaningful structure in visualizations and help draw insights from the data. Clustering, in turn, organizes unstructured data into meaningful groups, aiding knowledge discovery, feature analysis, and retrieval-augmented generation.
These methods also help address learned feature biases, errors, and redundancies that affect model performance. Dimensionality reduction, when done right, can help identify outliers and irregular patterns within the data. Robust clustering supports scalable embedding pipelines, enabling efficient data curation and querying. From k-means to DBSCAN and hierarchical approaches like FINCH, selecting the right method is key, and it means balancing scalability, noise sensitivity, and computational demands.
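The noise-sensitivity trade-off between clustering methods can be illustrated with a small sketch. The pure-Python toy below is not taken from the tutorial materials: the function names `kmeans` and `dbscan`, the grid-shaped toy data, and the `eps` and `min_pts` values are all assumptions chosen for illustration. It contrasts a plain Lloyd's k-means, which must assign every point to some cluster, with a minimal DBSCAN, which can label an isolated point as noise (`-1`).

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means: every point is forced into one of k clusters."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by Euclidean distance.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def dbscan(points, eps, min_pts):
    """Minimal O(n^2) DBSCAN: returns -1 for points not reachable from a core point."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise point adopted as a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # only core points expand the cluster
                queue.extend(neighbors[j])
    return labels


# Toy data (an assumption for illustration): two dense 4x4 grids and one outlier.
blob_a = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(4)]
blob_b = [(5 + 0.1 * i, 5 + 0.1 * j) for i in range(4) for j in range(4)]
points = blob_a + blob_b + [(20.0, -20.0)]

km_labels = kmeans(points, k=2)
db_labels = dbscan(points, eps=0.5, min_pts=4)

print("k-means label of the outlier:", km_labels[-1])  # always a cluster index
print("DBSCAN label of the outlier:", db_labels[-1])   # -1 marks noise
```

In practice one would more likely reach for an optimized library implementation; the point here is only that the choice of method determines whether outliers are absorbed into clusters or flagged separately.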
This tutorial provides an in-depth exploration of the current state of the art in data exploration techniques, namely dimensionality reduction for data visualization and clustering, with a strong focus on their applications in computer vision. Attendees will gain a comprehensive understanding of both foundational and advanced techniques beyond classic methods like t-SNE and k-means. Through a blend of theoretical insights and hands-on applications, participants will learn how to apply these methods effectively to tasks such as big data analysis, representation learning, model development, pseudo-labeling, and data annotation.
Time Slot | Speaker | Talk Title |
---|---|---|
8:30 - 8:40 | Constantin Seibold | Opening Remarks |
8:40 - 9:25 | M. Saquib Sarfraz, Marios Koulakis | Clustering and its applications in modern computer vision |
9:25 - 10:10 | Laurens van der Maaten | How to use and not use modern dimension reduction techniques for data visualization |
10:10 - 10:20 | Coffee Break ☕ | |
10:20 - 11:05 | Brandon Duderstadt | Scaling dimensionality reduction with NOMAD projection |
11:05 - 12:00 | All Speakers | Q&A and Panel Discussion |
* Schedule is tentative and not yet confirmed.
We're excited to offer you a wealth of materials and interactive resources to enhance your experience during our tutorial. In addition to hands-on exercises and interactive Jupyter notebooks demonstrating cutting-edge techniques in dimensionality reduction and clustering for computer vision, we've prepared a variety of supportive resources to inspire and engage you.
We warmly invite you to explore the following resources, which have been thoughtfully curated to spark your curiosity and support your learning journey:
We are thrilled to welcome researchers, practitioners, and students to this half-day in-person tutorial. We hope these resources will spark new ideas and enhance your exploration of the exciting world of computer vision and data analysis.
For further information or inquiries, please contact the organizers via the social links provided in the speakers’ section.