Algorithmic Vistas: Exploring Classical Piano Interpretation
Methodological Horizons for Exploring Individual Musical Interpretation & the Future Vitality of Classical Music
A PhD proposal exploring computational methodologies for comparative listening and audience engagement in classical piano recordings
I. Overture: The Vision
Classical music, particularly the piano repertoire, thrives on the nuance of interpretation. The same score yields vastly different experiences in the hands of different artists across time. While the practice of comparative listening—comparing different interpretations—is central to deep appreciation and learning, it faces significant hurdles. Human listeners are limited by time, effort, attention span, memory, and cognitive biases when attempting to analyze numerous recordings.
Furthermore, the sheer scale of available recordings presents a daunting challenge; hundreds of acclaimed interpretations exist for countless pieces, making thorough manual comparison practically impossible, especially for longer works. This proposal outlines a PhD project aimed at overcoming these limitations by developing a novel machine learning methodology coupled with an intuitive frontend interface to facilitate the exploration and comparative analysis of classical piano recordings at scale.
"The goal is not merely technological advancement, but to deepen our connection to musical artistry, making the profound experience of comparative listening more accessible and fostering a renewed appreciation for classical music."
[Expand here with your 500-word speculative overview. Discuss the core idea, the passion behind it, and the potential impact. Mention the convergence of your interests: cataloguing, musical impact, technology, humanities.]
II. Vision & Relevance
This project addresses a critical intersection of artistic tradition, technological opportunity, and cultural need. Its relevance stems from several key areas:
Artistic Experience
Enhancing the core experience of classical music: appreciating diverse interpretations. Making the 'magic' of specific renderings more tangible and understandable, overcoming the limitations of manual listening.
Technological Advancement
Pushing the boundaries of Music Information Retrieval (MIR) by tackling nuanced comparative analysis. Piano data, focusing on timing, dynamics, and potentially pedaling, provides a rich, constrained testbed for methods potentially applicable to other music.
Cultural Revitalization
Addressing declining engagement by lowering the barrier to entry for deep listening. Creating tools that can spark curiosity and sustain interest in the art form for future generations by efficiently navigating the vast recording landscape.
Pedagogical Potential
Providing valuable tools for music students and educators to study performance practice, compare interpretations efficiently, and develop critical listening skills beyond what is feasible manually.
FINER Criteria
Feasible: Leverages existing audio processing/ML advancements; phased approach manages complexity. Piano music offers a suitable starting point thanks to its constrained timbre, abundant recordings, and clearly measurable timing and dynamics.
Interesting: Addresses core classical music appreciation; novel problem in music informatics concerning interpretation at scale.
Novel: The focus on comparative interpretation at scale, paired with a dedicated frontend, is unique, building on prior work while aiming for greater accessibility and depth.
Ethical: Promotes cultural heritage/accessibility; requires careful consideration of copyright and potential algorithmic biases.
Relevant: Addresses artistic, technological, cultural, and educational needs highlighted by the challenges of manual comparative listening.
[Elaborate on the relevance points from your notes. Explain why humans struggle with scale and why now is the right time technologically. Mention anecdotes about listener difficulties.]
III. Research Questions
How can machine learning methodologies effectively detect, quantify, and characterize interpretive differences (e.g., timing, dynamics, articulation, potentially pedaling proxies) between performances of the same classical piano composition, in a manner that scales to large datasets?
What visualization and interaction paradigms, integrated with audio playback, best communicate these nuanced interpretive differences to both expert and novice listeners in an engaging and informative manner?
How can an interactive comparative listening tool be designed to encourage exploration, discovery, and deeper engagement with the classical piano repertoire, potentially overcoming barriers faced by new listeners?
What are the potential pedagogical applications and impacts of using such computational tools in music education settings for teaching performance practice and listening skills?
IV. Project Connections
This research connects multiple domains, aiming to create value at their intersection.
V. Proposed Methodology
This research employs a mixed-methods approach, combining computational analysis with human-centered design and evaluation. The focus begins with solo piano music, particularly pieces such as the Chopin Mazurkas, which are well suited to analysis thanks to their numerous recordings, clear structure, and relative brevity, allowing for methodological refinement before potential expansion.
1. Machine Learning Core
Develop a pipeline for:
Precise audio alignment (e.g., DTW variants, sequence models) to compare equivalent passages despite tempo variations (see the alignment sketch below).
Extraction of musically relevant features, primarily timing and dynamics (volume) and potentially pedaling proxies, to approximate MIDI-like performance data.
Difference modeling using techniques such as contrastive learning, attention mechanisms, Bayesian methods (e.g., inspired by Guichaoua et al., 2024), or specialized network architectures to identify and quantify interpretive variations (a contrastive sketch follows the note below).
Ensuring scalability to large recording datasets such as MazurkaBL and beyond.
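To make the first two pipeline stages concrete, here is a minimal sketch that aligns two recordings of the same piece with librosa's chroma-based DTW and derives crude tempo and dynamics features (cf. Müller, 2015). File paths, hop length, and the tempo-slope window are illustrative assumptions rather than final design choices:

```python
# Minimal sketch: chroma-DTW alignment plus crude tempo/dynamics features.
# File paths, hop length, and the slope window are illustrative assumptions.
import librosa
import numpy as np

def align_and_extract(path_a: str, path_b: str, hop_length: int = 512):
    """Align two recordings of the same piece; return time maps and features."""
    y_a, sr_a = librosa.load(path_a)  # librosa resamples to 22,050 Hz by default
    y_b, sr_b = librosa.load(path_b)
    # Chroma features are comparatively robust to timbre/recording differences
    chroma_a = librosa.feature.chroma_cqt(y=y_a, sr=sr_a, hop_length=hop_length)
    chroma_b = librosa.feature.chroma_cqt(y=y_b, sr=sr_b, hop_length=hop_length)
    # wp holds the optimal warping path as (frame_a, frame_b) pairs, end first
    _, wp = librosa.sequence.dtw(X=chroma_a, Y=chroma_b)
    wp = wp[::-1]  # chronological order
    times_a = librosa.frames_to_time(wp[:, 0], sr=sr_a, hop_length=hop_length)
    times_b = librosa.frames_to_time(wp[:, 1], sr=sr_b, hop_length=hop_length)
    # Windowed slope of the warping path approximates relative tempo;
    # values > 1 mean recording B is locally slower than recording A.
    w = 20
    rel_tempo = (times_b[w:] - times_b[:-w]) / np.maximum(
        times_a[w:] - times_a[:-w], 1e-6)
    # RMS energy as a crude dynamics (loudness) proxy for each recording
    rms_a = librosa.feature.rms(y=y_a, hop_length=hop_length)[0]
    rms_b = librosa.feature.rms(y=y_b, hop_length=hop_length)[0]
    return times_a, times_b, rel_tempo, rms_a, rms_b
```

A production pipeline would layer beat tracking or score anchoring on top of this raw frame-level alignment.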
[Provide more technical detail here. What specific models or techniques are you considering? What are the challenges? Reference relevant existing work you can build upon.]
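As one concrete possibility for the difference-modeling stage, the sketch below pairs a small Siamese encoder over beat-level tempo/loudness curves with a margin-based contrastive loss. The architecture, dimensions, and pairing scheme are illustrative assumptions, not a committed design:

```python
# Illustrative Siamese encoder + margin-based contrastive loss over
# beat-level tempo/loudness curves. All sizes here are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerformanceEncoder(nn.Module):
    """Map a (batch, 2, n_beats) tempo+loudness curve to a unit embedding."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(32, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(z_a, z_b, same_label, margin: float = 0.5):
    """Pull matched pairs together; push mismatched pairs past the margin."""
    d = (z_a - z_b).pow(2).sum(dim=-1).sqrt()
    return (same_label * d.pow(2)
            + (1 - same_label) * F.relu(margin - d).pow(2)).mean()

# Toy usage: 8 curve pairs, 128 beats each, random match labels
enc = PerformanceEncoder()
x_a, x_b = torch.randn(8, 2, 128), torch.randn(8, 2, 128)
labels = torch.randint(0, 2, (8,)).float()
contrastive_loss(enc(x_a), enc(x_b), labels).backward()
```

Whether positive pairs should be augmentations of one recording or distinct recordings of the same piece is itself a design question the project would need to settle empirically.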
2. Frontend Exploration Tool
Design and build an interactive web interface featuring:
Intuitive recording selection and comparison setup.
Novel visualizations synchronized with audio playback (e.g., comparative timelines, dynamic contours, feature highlighting inspired by Cook (2007) or Zhou & Fabian (2021) but automated and interactive).
Potential for annotation/tagging (manual or automated).
Navigation based on similarity or specific performance characteristics, potentially identifying outliers or clusters of interpretations.
[Describe the user experience goals. How will it make comparison intuitive and insightful? What technologies might be used (e.g., React, D3.js, Web Audio API)?]
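Whatever frontend stack is chosen, the analysis backend will need to serve comparison results in a compact, playback-synchronizable form. The following dataclass sketches one hypothetical per-beat payload; all field names and values are assumptions for illustration:

```python
# Hypothetical per-beat payload the backend might serve to the visualization
# layer; field names and the example values are illustrative assumptions.
from dataclasses import dataclass, asdict
import json

@dataclass
class BeatComparison:
    beat_index: int
    score_position: str   # human-readable anchor, e.g. "m. 12, beat 1"
    tempo_bpm: dict       # recording id -> local tempo at this beat
    loudness_db: dict     # recording id -> smoothed loudness at this beat

payload = BeatComparison(
    beat_index=45,
    score_position="m. 12, beat 1",
    tempo_bpm={"rubinstein_1937": 52.0, "horowitz_1957": 44.5},
    loudness_db={"rubinstein_1937": -18.2, "horowitz_1957": -21.7},
)
print(json.dumps(asdict(payload), indent=2))  # JSON for the web client
```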
3. Evaluation
Assess the system through:
Quantitative evaluation of ML model accuracy against ground truth or established datasets.
Qualitative and quantitative user studies with diverse participants (novices, experts, educators) to evaluate the frontend's usability, effectiveness, and impact on engagement, perhaps drawing inspiration from Repp's (1997) experimental approach but focused on the tool.
Case studies exploring pedagogical applications and potential for enhancing music appreciation.
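For the quantitative strand, one natural baseline metric, assuming beat-level ground truth in the MazurkaBL style, is the standard beat F-measure from mir_eval; the timestamps below are toy values for illustration only:

```python
# Scoring estimated beat times against annotated ground truth with the
# standard mir_eval beat F-measure. Timestamps below are toy values.
import numpy as np
import mir_eval

reference_beats = np.array([0.50, 1.02, 1.55, 2.10])  # annotated (seconds)
estimated_beats = np.array([0.52, 1.00, 1.58, 2.06])  # from the pipeline
score = mir_eval.beat.f_measure(reference_beats, estimated_beats)
print(f"Beat F-measure: {score:.3f}")
```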
VI. Envisioning the Tool: Interface Mockup
The following mockup illustrates the core concepts of the proposed interactive exploration tool:
Comparative Piano Explorer
Chopin: Nocturne in E-flat Major, Op. 9, No. 2
Performance Timeline & Visualization
(Imagine synchronized waveforms, tempo graphs comparing multiple recordings, dynamic curves, or feature highlights here, linked to playback)
Selected Recordings
Arthur Rubinstein (1937)
Vladimir Horowitz (1957)
Maria João Pires (1996)
(+ Add / Filter...)
Performance Insights (at Playhead)
Notable tempo differences in this phrase: Horowitz uses significant rubato (red line deviation), while Pires (green) maintains a brisker, more consistent pace. Rubinstein's recording shows a wider dynamic range here (visualized elsewhere).
▶ Play / Pause
Compare Features
⚙ Settings
[Briefly explain what the mockup demonstrates, e.g., comparing three Chopin nocturne recordings, visualizing tempo/dynamics, showing contextual insights.]
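Before any web implementation, the tempo-timeline panel above can be prototyped in a few lines of matplotlib to test whether such comparative curves read well. The curves below are fabricated toy data, with colors matching the mockup's red-Horowitz/green-Pires convention:

```python
# Low-fidelity prototype of the mockup's comparative tempo timeline.
# The curves are fabricated toy data, not measurements of real recordings.
import numpy as np
import matplotlib.pyplot as plt

beats = np.arange(64)
rng = np.random.default_rng(0)
horowitz = 48 + 10 * np.sin(beats / 6) + rng.normal(0, 2, 64)  # heavy rubato
pires = 56 + 2 * np.sin(beats / 6) + rng.normal(0, 1, 64)      # steadier pace

plt.plot(beats, horowitz, "r-", label="Horowitz (1957)")
plt.plot(beats, pires, "g-", label="Pires (1996)")
plt.xlabel("Beat index")
plt.ylabel("Local tempo (BPM)")
plt.title("Comparative tempo timeline (toy data)")
plt.legend()
plt.show()
```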
VII. Standing on Shoulders: Literature Review
This project builds upon decades of research evolving from manual analysis to sophisticated computational methods in Music Information Retrieval (MIR), audio signal processing, machine learning, and computational musicology. Early efforts often involved manual data creation, while recent work leverages algorithmic processing and large datasets.
Foundations in Performance Analysis
Early influential work involved breaking down performances into core components. Repp (1997) conducted experiments using digitally manipulated piano performances to isolate and evaluate specific attributes like timing and dynamics, even creating an "average" performance computationally. Cook (2007) applied early computational methods alongside manual analysis to study Chopin's Mazurkas, highlighting both the potential and the challenges of the approach.
Key Citations: Repp (1997), Cook (2007)
Methodological Development & Datasets
The field has seen numerous methodological proposals, often tested via case studies. Piano music, especially Chopin's Mazurkas, became a common focus due to its suitability for developing analytical techniques (Sapp, 2008; Kosta et al., 2018; Guichaoua et al., 2024). Methodologies evolved from hybrid numeric/rank metrics (Sapp, 2008) and manual/algorithmic approaches (Cook, 2007) to complex algorithmic techniques like hierarchical multidimensional scaling for orchestral works (Yanchenko & Hoff, 2020) and end-to-end Bayesian segmentation for piano (Guichaoua et al., 2024). The creation of large, annotated datasets like MazurkaBL (Kosta et al., 2018), containing score-aligned data for thousands of recordings, has been crucial for training and validating modern approaches.
Key Citations: Sapp (2008), Kosta et al. (2018), Yanchenko & Hoff (2020), Guichaoua et al. (2024), Zhou & Fabian (2021)
Machine Learning & Modern Approaches
Recent advancements leverage machine learning, including neural networks for tasks like score following (Dorfer et al., 2016) and deep learning for feature extraction and comparison (Zhang et al., 2024). Techniques from other domains, such as image-based analysis (Liem & Hanjalic, 2015), Siamese networks, and contrastive learning, are being adapted for nuanced audio comparison. There is significant overlap with AI music generation research, as both fields tackle the challenge of deconstructing and understanding musical components.
Key Citations: Dorfer et al. (2016), Liem & Hanjalic (2015), Zhang et al. (2024), [Relevant ML/audio papers]
Interactive Interfaces & Visualization
While less common than methodological papers, research in HCI and music visualization informs the design of tools for exploring this data. Effective interfaces are needed to translate complex analyses into meaningful experiences for users.
Key Citations: [Relevant HCI/visualization papers, e.g., from CHI, NIME, VIS]
[This section requires further expansion with more specific details from the cited papers and potentially others found during a deeper review, clearly identifying the specific gap this project aims to fill.]
VIII. The Researcher: Preparedness
My background provides a unique foundation for this interdisciplinary project:
Academic Synergy: Double degree in Informatics and German Studies, plus prior Computer Science and Music studies, demonstrating a long-term commitment to integrating technology, data analysis, arts, and humanities.
Proven Research Initiative: Successfully designed and executed self-guided research projects (e.g., large independent study, Spotify API analysis, Schubert sentiment analysis), showcasing independent research capabilities.
Relevant Technical Skills: Proficiency in [Python, relevant ML libraries (e.g., Librosa, PyTorch/TensorFlow), data visualization tools, web development basics (HTML/CSS/JS)? Be specific!].
Deep Domain Passion: A decade of passionate engagement with classical piano, including broadcasting, extensive comparative listening ([X] records curated), and translation work focused on accessibility. This project directly addresses long-held research interests.
Communication Aptitude: Experience translating complex musical ideas into accessible formats for broad audiences (e.g., via broadcasting).
[Expand on specific projects mentioned in your notes. Emphasize the interdisciplinary nature and the long-term passion for this specific problem.]
IX. The Setting: Why University of Washington?
The University of Washington offers an unparalleled environment for this research:
Interdisciplinary Hub: UW excels at fostering collaboration across departments like the iSchool, Computer Science & Engineering, Music, and DXARTS, providing access to world-class faculty in ML, HCI, MIR, digital humanities, and musicology.
Leading Research: Home to leading researchers and labs in relevant areas [Mention specific professors, labs, or groups whose work aligns, e.g., Prof. Anna Preus if applicable, ML/AI groups, HCI labs, MIR researchers].
Technological Ecosystem: Seattle's position as a global tech hub provides a rich environment, potential industry connections, and access to cutting-edge developments.
Vibrant Arts Community: The city's strong support for the arts, including the renowned Seattle Symphony, offers potential for local impact, collaboration, and user study recruitment.
[Tailor this heavily based on the specific program/department you target. Why is *this* specific place the best fit for *this* project and *your* background?]
X. The Path Forward: Timeline & Milestones
A phased approach ensures manageable progress and allows for adaptation:
Phase 1 (Year 1): Foundation & Methodology
Deepen literature review, refine research questions, collect initial dataset (e.g., focusing on Chopin Mazurkas). Develop and validate the core audio alignment and feature extraction pipeline. Implement first-generation difference detection models.
Milestone: Validated alignment and feature extraction pipeline; initial Mazurka corpus; first-generation difference models.
Phase 2 (Year 2): Model Refinement & Prototyping
Refine ML models (e.g., exploring Bayesian or contrastive approaches) and explore advanced architectures. Begin design and early prototyping of the interactive frontend. Conduct initial feasibility tests and small-scale evaluations.
Milestone: Improved ML models; low-fidelity frontend prototype; potential conference paper [e.g., ISMIR, ICASSP].
Phase 3 (Year 3): Frontend Development & Integration
Develop functional web-based exploration tool, integrating ML backend analysis. Design and implement core visualizations and interactions. Begin usability testing with target user groups.
Milestone: Working prototype of the interactive tool; publication on methodology/prototype [e.g., TISMIR, Journal of New Music Research].
Phase 4 (Year 4): User Evaluation & Scaling
Conduct comprehensive user studies (qualitative/quantitative) with diverse participants. Scale analysis pipeline to larger corpus/different repertoire. Refine tool based on feedback. Explore initial pedagogical applications.
Milestone: User study results published [e.g., CHI, NIME, Music Perception]; robust analysis pipeline.
Phase 5 (Year 5): Expansion & Dissertation
Explore potential expansions (e.g., other instruments, broader analysis features). Address the revitalization goals more directly. Finalize research, write and defend the dissertation. Seek high-impact publications.
Milestone: Completed dissertation; publications in top-tier venues; potential public release/outreach component; grant applications for future work.
[Consider adding risk management points. Mention potential for international study if desired.]
XI. Resonance: Potential Impact
This project aims to contribute significantly by unlocking the insights currently hidden within vast archives of recordings:
Music Information Retrieval (MIR): Advancing computational understanding of musical performance nuance and developing novel methods for scalable comparative audio analysis.
Human-Computer Interaction (HCI): Designing and evaluating novel interactive systems for exploring complex, time-based artistic data.
Digital Humanities: Providing new computational lenses for studying cultural heritage, artistic interpretation, and performance practice evolution.
Music Education: Creating innovative tools to support the teaching and learning of performance practice and critical listening, moving beyond anecdotal comparisons.
Cultural Engagement: Fostering broader appreciation and deeper engagement with classical music by making its interpretive richness more accessible and navigable, potentially attracting new audiences.
[Connect back to the goal of revitalizing classical music. How might this project specifically contribute to that? Consider the 'hook' for new listeners.]
XII. Harmonia: Ethical Considerations
Ethical conduct is paramount. While aiming to enhance appreciation, potential risks must be addressed:
Copyright: Primarily utilizing commercially available recordings for analysis under fair use principles for research. Focusing on feature analysis rather than audio redistribution unless licensed. Ensuring proper attribution to artists and labels.
Algorithmic Bias & Impact on Artistry: Acknowledging potential biases in recording availability and algorithmic analysis. Critically considering how analytical tools might influence performance practice – avoiding over-centralization around an "average" or conversely, over-valuing algorithmically flagged "uniqueness" at the expense of musicality. The goal is to provide insight, not dictate artistic norms.
Data Representation: Ensuring that the chosen features and models capture musically meaningful aspects of interpretation, rather than arbitrary signal characteristics.
User Privacy: Ensuring anonymity and data security if user interaction data is collected during evaluations.
Accessibility: Designing the frontend tool with accessibility principles in mind for diverse users.
XIII. Coda: References
[A comprehensive list of cited works will be included here in standard academic format (e.g., APA, Chicago). This includes key papers mentioned above and others identified during the full review.]
Cook, N. (2007). Performance analysis and Chopin's mazurkas. Musicae Scientiae, 11(2), 183-207. https://doi.org/10.1177/102986490701100203
Dorfer, M., Arzt, A., & Widmer, G. (2016). Towards score following in sheet music images. Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR). https://doi.org/10.48550/arXiv.1612.05050
Guichaoua, C., Lascabettes, P., & Chew, E. (2024). End-to-end Bayesian segmentation and similarity assessment of performed music tempo and dynamics without score information. Music & Science, 7. https://doi.org/10.1177/20592043241233411
Kosta, K., Bandtlow, O. F., & Chew, E. (2018). MazurkaBL: Score-aligned loudness, beat, and expressive markings data for 2,000 Chopin mazurka recordings. Proceedings of the 4th International Conference on Technologies for Music Notation and Representation (TENOR), 85-94.
Liem, C. C. S., & Hanjalic, A. (2015). Comparative analysis of orchestral performance recordings: An image-based approach. Proceedings of the 16th International Society for Music Information Retrieval Conference (ISMIR), 302-308.
Müller, M. (2015). Fundamentals of Music Processing. Springer.
Repp, B. H. (1997). The aesthetic quality of a quantitatively average music performance: Two preliminary experiments. Music Perception, 14(4), 419-444. https://doi.org/10.2307/40285732
Sapp, C. S. (2008). Hybrid numeric/rank similarity metrics for musical performance analysis. Proceedings of the 9th International Society for Music Information Retrieval Conference (ISMIR), 501-506.
Widmer, G. (2016). Getting closer to the essence of music: The Con Espressione Manifesto. ACM Transactions on Intelligent Systems and Technology (TIST), 7(3), 1-26. https://doi.org/10.1145/2724718
Yanchenko, A. K., & Hoff, P. D. (2020). Hierarchical multidimensional scaling for the comparison of musical performance styles. Annals of Applied Statistics, 14(4), 1581-1603. https://doi.org/10.1214/20-AOAS1391
Zhang, H., Liang, J., & Dixon, S. (2024). From audio encoders to piano judges: benchmarking performance understanding for solo piano. arXiv preprint arXiv:2407.04518. https://doi.org/10.48550/arXiv.2407.04518
Zhou, D. Q., & Fabian, D. (2021). A three-dimensional model for evaluating individual differences in tempo and tempo variation in musical performance. Musicae Scientiae, 25(2), 252-267. https://doi.org/10.1177/1029864919873193
[Add many more placeholder or actual key references from a full literature search]