Welcome to Dozensonline. We hope you enjoy your visit.
Forum Rules RULES~ "Music"

Please pay respect to other people's points of view, whether you agree with them or not, and just generally try to act civilly.

Don't post any pornography, warez, or other offensive or prohibited materials.

Postings can be about any facet of music whatsoever.

 

 Developing A New Audio File Format
Dan
Posted: Dec 7 2017, 04:31 AM


Dozens Disciple


Group: Members
Posts: 1,463
Member No.: 19
Joined: 8-August 05



You may be thinking: Why? Aren't there enough digital audio formats already?

But the thing is, I want a format that's simple. Like WAV, except based on frequency instead of time-domain signals. I want simple math. (Well, if you count linear algebra and Fourier analysis as "simple".) I want any programmer to be able to generate "chiptune" music or make simple musical transformations like "make everything an octave lower" without specialized libraries.

Perhaps such a thing already exists; if so, please let me know. Until then, here are my notes for the design of a new format, which I am tentatively naming "LFE" for "Logarithmic Frequency Encoding".

Basis Frequencies

Each time-block of music will be represented as a sum of "basis" waves.

The basis frequencies will be equally logarithmically spaced, with *100 (144) frequencies per octave. The interval between each frequency is thus 8.33... "cents", approximating a "just noticeable difference" in frequency.

The specific basis frequencies are:

F(n) = 2^((n-972)/144)*440 Hz

Note the bias towards the A440 12-EDO pitch standard: Music using this tuning (and consisting only of sine waves) can be exactly represented. The numbering n is arbitrarily chosen to be exactly a dozen times the TGM note number.
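As a quick illustration of the formula above, here is a minimal Python sketch. The function name basis_frequency and the helper edo12_index are my own labels, not part of any spec; the helper assumes 12-EDO semitones are counted relative to A440.

```python
def basis_frequency(n):
    """Return F(n) = 2^((n - 972) / 144) * 440 Hz."""
    return 440.0 * 2.0 ** ((n - 972) / 144)


def edo12_index(semitones_from_a440):
    """Index n of a 12-EDO note given its distance in semitones from A440.

    One 12-EDO semitone spans 144 / 12 = 12 basis steps, so every 12-EDO
    pitch lands exactly on a basis frequency.
    """
    return 972 + 12 * semitones_from_a440


if __name__ == "__main__":
    print(basis_frequency(972))               # A440 itself
    print(basis_frequency(972 - 144))         # one octave lower
    print(basis_frequency(edo12_index(3)))    # three semitones above A440
```

Since an octave is exactly 144 steps, "make everything an octave lower" amounts to subtracting 144 from every index n.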

Since computer memory is a finite resource, we'll need to impose reasonable range constraints on n.

If we want our audio files to be converted to or from CDs, then we'll have to deal with their standard 44 100 Hz sampling rate and thus a Nyquist frequency of 22 050 Hz. It turns out that n = 1785 = *1049 is the highest we can go.

What about the lower end? It's frequently stated that the "normal" lower limit of human hearing is 20 Hz, which is approximated by F(330) = 20.015231264080082 Hz. Using this as our cutoff frequency gives us 1456 basis frequencies to work with, or a span of *A.14 octaves.

But that's kind of a "weird" number, so let's use a lower limit of F(273) = 15.212581077221454 Hz instead, giving us a "rounder" *A.6 octaves.

With 273 <= n <= 1785, there are 1513 different basis frequencies.
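The range numbers above can be re-derived with a few lines of Python. This is only a sketch: highest_index_below is a hypothetical helper name, and the small correction loops are there just to guard against floating-point rounding in the logarithm.

```python
import math


def basis_frequency(n):
    """Return F(n) = 2^((n - 972) / 144) * 440 Hz."""
    return 440.0 * 2.0 ** ((n - 972) / 144)


def highest_index_below(limit_hz):
    """Largest n such that F(n) does not exceed limit_hz."""
    n = math.floor(972 + 144 * math.log2(limit_hz / 440.0))
    # Nudge the estimate in case the logarithm rounded the wrong way.
    while basis_frequency(n + 1) <= limit_hz:
        n += 1
    while basis_frequency(n) > limit_hz:
        n -= 1
    return n


if __name__ == "__main__":
    n_high = highest_index_below(44100 / 2)   # CD Nyquist frequency
    n_low = 273
    print(n_high)                      # highest usable index
    print(n_high - n_low + 1)          # number of basis frequencies
    print((n_high - n_low) / 144)      # span in octaves
    print(basis_frequency(330), basis_frequency(n_low))
```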

Perhaps it would be a good idea to make the lower and upper frequencies configurable in the file format, to allow "compression" by band-limiting the signal, or to allow representation of infrasound and ultrasound by expanding the frequency range. But for now, I shall assume the above numbers as a sensible "default".

To be continued...
Dan
Posted: Dec 7 2017, 05:47 AM





Length of a time block

How many time-domain samples will be used to generate each set of frequency-domain samples? If we use too large of a time block, then fast music can't be represented accurately. OTOH, if we use too small of a time block, then file sizes will be huge.

If we define an LFE time block as *0.1 second (or 3675 CD audio samples), then with 1513 basis frequencies, we will need 12*1513*2 = 36 312 frequency-domain samples per second, comparable in size to the original CD audio format. (The reason for the multiplication by 2 is that a complex representation, like cos and sin components, is needed to be able to represent the phase of a wave.)
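To make the cos/sin idea concrete, here is a rough Python sketch that turns one block of frequency-domain samples back into time-domain samples. The data layout (a dict mapping n to a pair of cos and sin amplitudes) and the name synthesize_block are assumptions for illustration only; nothing about the actual file layout has been decided.

```python
import math

SAMPLE_RATE = 44100                  # CD sampling rate in Hz
BLOCK_SAMPLES = SAMPLE_RATE // 12    # 3675 samples in a 1/12 second block


def basis_frequency(n):
    """Return F(n) = 2^((n - 972) / 144) * 440 Hz."""
    return 440.0 * 2.0 ** ((n - 972) / 144)


def synthesize_block(coeffs, num_samples=BLOCK_SAMPLES, sample_rate=SAMPLE_RATE):
    """Sum the basis waves described by coeffs into one time block.

    coeffs maps a basis index n to (cos_amplitude, sin_amplitude).
    """
    block = [0.0] * num_samples
    for n, (a, b) in coeffs.items():
        # Phase advance per sample for this basis frequency.
        omega = 2.0 * math.pi * basis_frequency(n) / sample_rate
        for t in range(num_samples):
            block[t] += a * math.cos(omega * t) + b * math.sin(omega * t)
    return block


if __name__ == "__main__":
    # A440 plus a quieter note one octave up, shifted in phase by 90 degrees.
    samples = synthesize_block({972: (1.0, 0.0), 972 + 144: (0.0, 0.5)})
    print(len(samples), samples[:4])
```

The pair (a, b) carries the same information as a magnitude and a phase angle, which is why each basis frequency costs two numbers per block.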

But a 1/12 second time quantum is rather long for fast music. In moderate-speed music with a "whole note" around 2 seconds, you're limited to 16th notes or triplet "24th notes" at best. Some pieces use 32nd or 64th notes.

If we go an order of magnitude faster, defining an LFE time block as *0.01 second (or 306.25 CD audio samples), then we will need 435 744 frequency-domain samples per second. That's a huge file size, but we may be stuck with it.
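The block-size arithmetic for both candidate block lengths can be checked with the sketch below. The name block_stats is mine, and the 1513 basis frequencies are the "default" range from the first post.

```python
from fractions import Fraction

CD_SAMPLE_RATE = 44100   # time-domain samples per second
NUM_BASIS = 1513         # basis frequencies for 273 <= n <= 1785


def block_stats(blocks_per_second):
    """Return (CD samples per block, frequency-domain samples per second)."""
    samples_per_block = Fraction(CD_SAMPLE_RATE, blocks_per_second)
    # Two numbers (cos and sin components) per basis frequency per block.
    freq_samples_per_second = blocks_per_second * NUM_BASIS * 2
    return samples_per_block, freq_samples_per_second


if __name__ == "__main__":
    for rate in (12, 144):
        per_block, per_second = block_stats(rate)
        print(rate, float(per_block), per_second)
```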

FWIW, the popular MP3 format uses 576 (or *400; was this format designed by dozenalists?) time-domain samples for each block of frequency-domain samples. At the CD sampling rate, that works out to about 13 ms or 0;01A6A second.
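For anyone who wants to check the dozenal figures in this thread, here is a small conversion sketch. It assumes A for ten (as used above) and B for eleven (my choice), and it truncates fractions rather than rounding them. The function names are hypothetical.

```python
from fractions import Fraction

DIGITS = "0123456789AB"   # A = ten, B = eleven (B is an assumed symbol)


def to_dozenal(value):
    """Convert a non-negative integer to a dozenal digit string."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return "0"
    out = []
    while value:
        value, remainder = divmod(value, 12)
        out.append(DIGITS[remainder])
    return "".join(reversed(out))


def dozenal_fraction(value, places):
    """Return the first `places` dozenal digits of a fraction in [0, 1)."""
    value = Fraction(value)
    out = []
    for _ in range(places):
        value *= 12
        digit = int(value)      # integer part is the next digit
        out.append(DIGITS[digit])
        value -= digit
    return "".join(out)


if __name__ == "__main__":
    print(to_dozenal(576), to_dozenal(1785))
    # Duration of a 576-sample block at 44 100 Hz, as dozenal fraction digits.
    print("0;" + dozenal_fraction(Fraction(576, 44100), 5))
```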



At this point, I open the floor for comments.
Double sharp
Posted: Dec 7 2017, 07:47 AM


Dozens Disciple


Group: Members
Posts: 1,401
Member No.: 1,150
Joined: 19-September 15



We are indeed stuck with 1/144 s, though not because of 128th and 256th notes (128ths are common in slow music of the Classical period, if rarer than 64ths, while 256ths are rare but not unheard of, mostly because they appear in Beethoven's Third Piano Concerto). Rather, it's because of piano glissandi, which are composed of discrete notes and can go really fast, though not as fast as one note per biciasecond.
Dan
Posted: Dec 8 2017, 05:53 AM





One possibility is to represent an audio file as a raster image, with one dimension having one pixel for each basis frequency, and the other dimension representing time, with one pixel per biciasecond time block. The color of each pixel would have a brightness representing its magnitude and the hue (as in the HSL or HSV color model) representing the phase angle.
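A rough sketch of the pixel mapping, using Python's standard colorsys module. Several details here are assumptions rather than anything settled: the coefficient is treated as a complex number, saturation is fixed at full, magnitudes are clipped against a caller-supplied max_magnitude, and the output is 8-bit RGB.

```python
import cmath
import colorsys
import math


def coefficient_to_rgb(coefficient, max_magnitude=1.0):
    """Map one complex frequency-domain sample to an 8-bit (R, G, B) pixel.

    Brightness (HSV value) encodes the magnitude; hue encodes the phase angle.
    """
    # Scale the magnitude into [0, 1], clipping anything above max_magnitude.
    value = min(abs(coefficient) / max_magnitude, 1.0)
    # cmath.phase returns an angle in [-pi, pi]; wrap it into [0, 1) turns.
    hue = (cmath.phase(coefficient) % (2.0 * math.pi)) / (2.0 * math.pi)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return tuple(round(255 * channel) for channel in (r, g, b))


if __name__ == "__main__":
    print(coefficient_to_rgb(0j))        # silence
    print(coefficient_to_rgb(1 + 0j))    # full magnitude, phase 0
    print(coefficient_to_rgb(-1 + 0j))   # full magnitude, phase of half a turn
```

One caveat with this mapping: 8 bits of brightness gives only 256 magnitude levels, so a real encoder would probably need a logarithmic scale or a deeper pixel format.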

We could then use conventional image compression methods, whether lossless (PNG) or lossy (JPEG), to compress music for us.

If nothing else, it would give us a nice way of auto-generating graphic notation for music.