hetro takes an input soundfile, decomposes it into component sinusoids, and outputs a description of the components in the form of breakpoint amplitude and frequency tracks. Analysis is conditioned by the control flags below. A space is optional between flag and value.
-s srate -- sampling rate of the audio input file. This will override the srate of the soundfile header, which otherwise applies. If neither is present, the default is 10000. Note that for adsyn synthesis the srate of the source file and the generating orchestra need not be the same.
-c channel -- channel number sought. The default is 1.
-b begin -- beginning time (in seconds) of the audio segment to be analyzed. The default is 0.0.
-d duration -- duration (in seconds) of the audio segment to be analyzed. The default of 0.0 means to the end of the file. Maximum length is 32.766 seconds.
-f begfreq -- estimated starting frequency of the fundamental, necessary to initialize the filter analysis. The default is 100 (cps).
-h partials -- number of harmonic partials sought in the audio file. Default is 10, maximum is a function of memory available.
-M maxamp -- maximum amplitude summed across all concurrent tracks. The default is 32767.
-m minamp -- amplitude threshold below which a single pair of amplitude/frequency tracks is considered dormant and will not contribute to output summation. Typical values: 128 (48 dB down from full scale), 64 (54 dB down), 32 (60 dB down), 0 (no thresholding). The default threshold is 64 (54 dB down).
-n brkpts -- initial number of analysis breakpoints in each amplitude and frequency track, prior to thresholding (-m) and linear breakpoint consolidation. The initial points are spread evenly over the duration. The default is 256.
-l cutfreq -- substitute a 3rd order Butterworth low-pass filter with cutoff frequency cutfreq (in Hz), in place of the default averaging comb filter. The default is 0 (don't use).
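The decibel figures quoted for the -m thresholds can be checked with a short calculation. The sketch below assumes that "full scale" means the 16-bit maximum of 32767 (the -M default); the function name db_below_full_scale is illustrative and not part of Csound.

```python
import math

FULL_SCALE = 32767  # assumed full-scale amplitude (16-bit maximum, the -M default)


def db_below_full_scale(amp: int) -> float:
    """Return how many dB the amplitude `amp` lies below full scale."""
    if amp <= 0:
        # A threshold of 0 means "no thresholding", so it has no dB equivalent.
        raise ValueError("amplitude must be positive")
    return 20.0 * math.log10(FULL_SCALE / amp)


for amp in (128, 64, 32):
    print(f"minamp {amp:>3}: about {db_below_full_scale(amp):.1f} dB down")
```

Each halving of the threshold lowers it by roughly 6 dB, which is why 128, 64 and 32 correspond to about 48, 54 and 60 dB down.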
As of Csound 4.08, hetro can write SDIF output files if the output file name ends with ".sdif" or ".SDIF". See the sdif2ad utility for more information about Csound's SDIF support.
The output file contains time-sequenced amplitude and frequency values for each partial of an additive complex audio source. The information is in the form of breakpoints (time, value, time, value, ....) using 16-bit integers in the range 0 - 32767. Time is given in milliseconds, and frequency in Hertz (cps). The breakpoint data is exclusively non-negative, and the values -1 and -2 uniquely signify the start of new amplitude and frequency tracks. A track is terminated by the value 32767. Before being written out, each track is data-reduced by amplitude thresholding and linear breakpoint consolidation.
A component partial is defined by two breakpoint sets: an amplitude set, and a frequency set. Within a composite file these sets may appear in any order (amplitude, frequency, amplitude ....; or amplitude, amplitude..., then frequency, frequency,...). During adsyn resynthesis the sets are automatically paired (amplitude, frequency) from the order in which they were found. There should be an equal number of each.
A legal adsyn control file could have the following format:

    -1 time1 value1 ... timeK valueK 32767 ; amplitude breakpoints for partial 1
    -2 time1 value1 ... timeL valueL 32767 ; frequency breakpoints for partial 1
    -1 time1 value1 ... timeM valueM 32767 ; amplitude breakpoints for partial 2
    -2 time1 value1 ... timeN valueN 32767 ; frequency breakpoints for partial 2
    -2 time1 value1 ..........
    -2 time1 value1 ..........             ; pairable tracks for partials 3 and 4
    -1 time1 value1 ..........
    -1 time2 value1 ..........
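The track layout described above can be illustrated with a small parser. This is a minimal sketch that works on an in-memory list of integers only: the on-disk byte layout (endianness, any header) is not specified here, so reading a real file is deliberately left out. The names parse_tracks and pair_partials are hypothetical, and the pairing rule (the i-th amplitude set goes with the i-th frequency set) is an interpretation of "paired from the order in which they were found".

```python
from typing import Dict, List, Tuple

AMP_START, FREQ_START, TRACK_END = -1, -2, 32767

Breakpoints = List[Tuple[int, int]]  # (time in milliseconds, value)


def parse_tracks(values: List[int]) -> Dict[str, List[Breakpoints]]:
    """Split a flat integer stream into amplitude and frequency tracks."""
    tracks: Dict[str, List[Breakpoints]] = {"amp": [], "freq": []}
    i = 0
    while i < len(values):
        marker = values[i]
        if marker not in (AMP_START, FREQ_START):
            raise ValueError(f"expected -1 or -2 at index {i}, got {marker}")
        kind = "amp" if marker == AMP_START else "freq"
        i += 1
        points: Breakpoints = []
        # Read (time, value) pairs until the terminator appears in a time slot.
        while i < len(values) and values[i] != TRACK_END:
            if i + 1 >= len(values):
                raise ValueError("time without a matching value")
            points.append((values[i], values[i + 1]))
            i += 2
        if i >= len(values):
            raise ValueError("track not terminated by 32767")
        i += 1  # skip the 32767 terminator
        tracks[kind].append(points)
    return tracks


def pair_partials(tracks: Dict[str, List[Breakpoints]]) -> List[Tuple[Breakpoints, Breakpoints]]:
    """Pair amplitude and frequency sets in the order they were found."""
    if len(tracks["amp"]) != len(tracks["freq"]):
        raise ValueError("unequal number of amplitude and frequency sets")
    return list(zip(tracks["amp"], tracks["freq"]))


# One partial: an amplitude set followed by its frequency set (made-up values).
stream = [-1, 0, 0, 10, 1000, 500, 0, 32767,
          -2, 0, 440, 500, 442, 32767]
partials = pair_partials(parse_tracks(stream))
print(len(partials), "partial(s) found")
```

Because the terminator is only tested in a time slot, a breakpoint value of 32767 is not mistaken for the end of a track.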
hetro -s44100 -b.5 -d2.5 -h16 -M24000 audiofile.test adsynfile7
This will analyze 2.5 seconds of channel 1 of a file "audiofile.test", recorded at 44.1 kHz, beginning 0.5 seconds from the start, and place the result in a file "adsynfile7". We request just the first 16 harmonics of the sound, with 256 initial breakpoint values per amplitude or frequency track, and a peak summation amplitude of 24000. The fundamental is estimated to begin at 100 Hz. Amplitude thresholding is at 54 dB down.
The Butterworth LPF is not enabled.
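When hetro is driven from a script, the command line above can be assembled from named options. The sketch below only builds the argument list and does not run anything; the helper name hetro_command and its keyword names are hypothetical, and only flags documented above are used. Running the result (for instance with Python's subprocess.run) would additionally assume that a hetro executable is on the search path.

```python
from typing import List, Optional


def hetro_command(infile: str, outfile: str, *,
                  srate: Optional[int] = None,
                  begin: Optional[float] = None,
                  duration: Optional[float] = None,
                  partials: Optional[int] = None,
                  maxamp: Optional[int] = None) -> List[str]:
    """Build a hetro argument list; options left as None keep their defaults."""
    args = ["hetro"]
    for flag, value in (("-s", srate), ("-b", begin), ("-d", duration),
                        ("-h", partials), ("-M", maxamp)):
        if value is not None:
            # No space between flag and value (the space is optional).
            args.append(f"{flag}{value}")
    args += [infile, outfile]
    return args


cmd = hetro_command("audiofile.test", "adsynfile7",
                    srate=44100, begin=0.5, duration=2.5,
                    partials=16, maxamp=24000)
print(" ".join(cmd))
```

Leaving an option out of the call leaves the corresponding flag off the command line, so hetro falls back to the defaults listed earlier (for example 256 breakpoints and a 100 Hz starting frequency).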
Here is an example of the hetro utility. It uses the file hetro.csd.
Example 1344. Example of the hetro utility.
See the sections Real-time Audio and Command Line Flags for more information on using command line flags.
<CsoundSynthesizer>
<CsOptions>
; Select audio/midi flags here according to platform
-odac -m0 --limiter=.95 ;;;realtime audio out, with limiter protection
; For Non-realtime output leave only the line below:
; -o hetro.wav -W ;;; for file output any platform
</CsOptions>
<CsInstruments>

sr = 44100
ksmps = 32
nchnls = 2
0dbfs = 1

; by Menno Knevel 2021

gilen filelen "fox.wav" ; get length of soundfile
; analyze sound file and output result to 3 hetro files
ires1 system_i 1,{{ hetro fox.wav fox1.het }} ; default settings
ires2 system_i 1,{{ hetro -f250 fox.wav fox2.het }} ; high starting frequency
ires3 system_i 1,{{ hetro -f100 -h180 fox.wav fox3.het }}; up to 18kHz!

instr 1 ; untreated signal
asig diskin2 "fox.wav", 1
prints "\n---***YOU NOW HEAR THE UNTREATED SOUND SAMPLE***---\n"
prints "---*duration of soundfile is %f seconds*---\\n",gilen
outs asig, asig
endin

instr 2
prints "\n---***YOU NOW HEAR THE RESULT OF THIS ANALYZED FILE:***---\n"
asig adsyn 1, 1, 1, p4
outs asig, asig
endin

</CsInstruments>
<CsScore>

i1 0 2.76 ; original sample
i2 5 2.76 "fox1.het" ; whole sentence
i2 10 2.76 "fox2.het" ; whole sentence, but analyzed with different settings
i2 15 2.76 "fox3.het" ; whole sentence, and again analyzed with different settings
e
</CsScore>
</CsoundSynthesizer>