Mechanical Music Digest  Archives

Converting Audio to MIDI Data File
By Julie Porter

I see there has been some use of so-called AI to convert audio to
MIDI. Actually, for solo piano in a controlled environment, this
has been a laboratory exercise for several decades.
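
As a rough illustration of that textbook starting point (not any
particular product's method), the sketch below estimates the
fundamental of a single sustained tone by autocorrelation and maps it
to the nearest MIDI note number. The function names, the sample rate
and the test tone are assumptions made up for this example; real piano
transcription also has to deal with chords, note onsets and
inharmonic partials.

```python
import math

def freq_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def estimate_pitch(samples, sample_rate, fmin=50.0, fmax=2000.0):
    """Estimate the fundamental of a monophonic tone by autocorrelation."""
    min_lag = int(sample_rate / fmax)   # shortest period considered
    max_lag = int(sample_rate / fmin)   # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the signal with a copy of itself shifted by `lag`.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Hypothetical test tone: 0.1 s of a pure 440 Hz sine sampled at 8 kHz.
RATE = 8000
tone = [math.sin(2 * math.pi * 440.0 * n / RATE) for n in range(800)]
pitch = estimate_pitch(tone, RATE)
print(round(pitch, 1), freq_to_midi(pitch))
```

The estimate is only as fine as one sample of lag, so the frequency
comes out slightly off 440 Hz, but it still rounds to MIDI note 69.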

I have been playing around with the fundamentals for years. Warren
Trachtman built Bayesian estimators and statistical error correction
into the RollScan converter program, which converts roll perforations
to MIDI.

I also noticed that my programs for creating pipe organ definition
files to play scanned pipe organ rolls use fuzzy logic and other
statistical means so that rolls scanned on one instrument can play
on a different one -- more often than not an emulator such as
Hauptwerk or jOrgan.

The only thing new about the current so-called AI fad is that it
is the buzzword du jour for getting banks to loan money or for
winning research grants. A few years back it was NFTs and bitcoin. Some
will even remember when collecting antique instruments was a sure
way to beat the stock market returns.

This is just a new name for old wine. Many fields, such as medicine,
have been using expert systems for decades. Elizabeth Holmes went
to prison for misusing this tech. (Her claims are probably valid,
but she was too soon to make them.) I used some of the same blood
count image analysis to find centers of overlapping circles in piano
roll scans.
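
The post does not say which method was used for this. One common
textbook approach is the circle Hough transform, sketched below for
circles of a known radius: every edge point votes for all the places
a center could be, and the true centers collect the most votes even
when the circles overlap. The function names, the radius and the
synthetic points are assumptions for illustration only.

```python
import math
from collections import Counter

def hough_circle_centers(edge_points, radius, n_angles=72):
    """Accumulate votes for centers of circles with a known radius."""
    votes = Counter()
    for x, y in edge_points:
        for k in range(n_angles):
            a = 2 * math.pi * k / n_angles
            # A center lies one radius away from the edge point.
            cx = round(x - radius * math.cos(a))
            cy = round(y - radius * math.sin(a))
            votes[(cx, cy)] += 1
    return votes

def circle_points(cx, cy, radius, n=72):
    """Synthetic edge points on a circle (stand-in for scan data)."""
    return [(cx + radius * math.cos(2 * math.pi * k / n),
             cy + radius * math.sin(2 * math.pi * k / n))
            for k in range(n)]

# Two overlapping holes of radius 10 whose centers are 12 units apart.
points = circle_points(20, 20, 10) + circle_points(32, 20, 10)
votes = hough_circle_centers(points, 10)
top_two = [center for center, _ in votes.most_common(2)]
print(sorted(top_two))
```

With real scans the edge points would come from an edge detector and
the radius would usually be searched over a small range as well.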

I looked at 30-year-old textbooks and they could have been written
yesterday, or this very instant, or tomorrow. What we now have is
the effect of Moore's law on these formulae -- some of which are
centuries old.

The cams that drove the automatons and early computing (especially
the fortune telling magicians and slot machines) that allowed dolls
to write and draw are based on the same mathematics used by Bernoulli
and Fourier. I call these phase circle infinities: the sums of sine
waves to infinity. Some call them imaginary numbers (as they are
based on the square root of -1).
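
To make the "circles" concrete, here is a minimal sketch of one
standard case, the Fourier series of a square wave. Each term is a
rotating phasor exp(i*h*t), a point going around a circle, and the
imaginary part of the running total is the sum of sine waves. The
function name and the sample point are assumptions chosen for the
example.

```python
import cmath
import math

def square_wave_partial_sum(t, n_terms):
    """Sum of the first n_terms odd harmonics of a unit square wave.

    Each harmonic is a phasor (4 / (pi * h)) * exp(i * h * t); the
    imaginary part of the total is the corresponding sum of sines.
    """
    total = 0j
    for k in range(n_terms):
        h = 2 * k + 1                       # odd harmonics 1, 3, 5, ...
        total += (4 / (math.pi * h)) * cmath.exp(1j * h * t)
    return total.imag

# At t = pi/2 the square wave equals 1; more terms get closer to it.
for n in (1, 5, 50, 500):
    print(n, round(square_wave_partial_sum(math.pi / 2, n), 4))
```

One term alone overshoots at 4/pi; adding harmonics pulls the value
toward 1, though an infinite number is needed to get there exactly.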

What we are seeing, hearing and feeling is illusion, although any
electrical engineer will tell you that the shock from a charged
capacitor/condenser is no illusion! The real question is what happens
to the MIDI when it is edited in Cakewalk [MIDI file editor]? Can the
different sections be extracted?

In some respects this is not much different from Mozart's musical dice
game, or the Panharmonicon I saw and heard demonstrated in one of the
European museums. (I wonder what happens when AI gets a gambling
addiction attempting to predict lottery numbers?)

Most of my work (which I gave a lecture on last year at the Second
Global Piano Roll Conference) was in using PostScript to directly
process MIDI. This had the effect of being able to render MIDI as
sheet music, which for the most part is unplayable as the data is
overfit to the page.

This is an area of concern. There is risk of bias in the training set.
The results get old quickly as they only respect the existing data.

I was also surprised how many people at the conference did not know
that there are apps for playing grooved recordings photographed with
an iPhone (or other smartphone). The recovery of event data from such
programs was one of my predictions. Due to time limitations I had to
cut most of this out of my presentation.

I was most interested when I rendered the scanned roll music as a
waveform (using what are called sound font samples) and saw how the
discrete samples combine to come together as a performance recording.
The Apple sound engine also renders MIDI sounds this way, which is one
reason the built-in Apple MIDI player does not work in real time as it
used to do.

One of the things I would like to do is to recover the equivalent of
sound fonts from wave recordings. An emulator such as jOrgan or
Hauptwerk is only as good as the underlying sound samples.

It is also interesting that the resultant MIDI from the GiantMIDI-Piano
project has problems with the pedaling cut off. This was an issue with
the solenoid floppy player I worked on in the early 2000s, mostly
because the note event fades to infinity.
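
A minimal sketch of the cut-off problem, under assumed data: a key
released while the sustain pedal is down keeps sounding until the
pedal comes up, so the note-off time alone understates the note. The
function name, the event layout and the max_hold cap (a crude stand-in
for the natural decay) are hypothetical and are not taken from
GiantMIDI-Piano or from any player firmware.

```python
def sounding_end(note_off, pedal_spans, max_hold=None):
    """Time at which a note actually stops sounding.

    note_off    -- time the key is released (seconds)
    pedal_spans -- list of (down, up) times for the sustain pedal
    max_hold    -- optional cap after key release, standing in for decay
    """
    end = note_off
    for down, up in pedal_spans:
        if down <= note_off < up:   # key released while the pedal is down
            end = up                # dampers stay lifted until pedal up
            break
    if max_hold is not None:
        end = min(end, note_off + max_hold)
    return end

# Hypothetical data: (pitch, note_on, note_off) and pedal (down, up).
notes = [(60, 0.0, 0.5), (64, 0.6, 1.0), (67, 2.5, 3.0)]
pedal = [(0.4, 2.0)]

for pitch, on, off in notes:
    print(pitch, on, sounding_end(off, pedal, max_hold=5.0))
```

Without some cap such as max_hold, a pedal that is never released
would hold the note forever, which is the "fades to infinity" case.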

I also noticed that the pedal solenoid (and the player) uses gravity
and time to catch the dampers. This is often used to compensate for
temperature and the room acoustics, which vary.

It will be interesting to see what all this leads to.

Julie Porter
Martinez, California


(Message sent Sat 13 May 2023, 19:19:54 GMT, from time zone GMT-0700.)

Key Words in Subject:  Audio, Converting, Data, File, MIDI

Unless otherwise noted, all opinions are those of the individual authors and may not represent those of the editors. Compilation copyright 1995-2024 by Jody Kravitz.