God bless us with true love

Sunday, October 20, 2013

How To Record Vocal

Hi! I'm tangkk from Guangzhou, China. This lesson is for week 1 of Introduction to Music Production. I'm going to talk about how to record vocal.

Check list:

To record vocal, you need to have several things at hand: a Microphone, an XLR cable, an audio interface, a headphone, and a computer with audio recording software such as Adobe Audition or Audacity, or DAW such as Garageband or Cubase.

1. Microphone
First step is to choose your mic. Depending on what type of vocal you want to record, you may choose either dynamic mic or condenser mic. Choose dynamic mic if you want to record something of hard style such as rock, something that do not require every fine detail of the singing to be captured. Choose condenser mic if you're targeting at soft style music such as blues, pop or jazz. You should add a mic cover or wind protected shield in front of the mic to protect it from unpleasant noise generated by the singer such as hiss.

2. Audio Interface

Next, check your audio interface. Since signal from microphone is far below line level, it's very sensitive to noise. Thus mic signal should always be transmitted via XLR cable, which is balance. Make sure your audio interface has a XLR input port with input gain knob. The output of your audio interface should be a USB cable connect to your computer.

3. Cable

The XLR cables are not all the same. There are different qualities. Try to buy one with high quality. Whenever you can use a short cable, don't use a long one. The USB cable is also important since low quality USB may damage data integrity which leads to unexpected result. The rule of thumb is to always use the one provided with the audio interface, if not, use expensive ones.

4. Computer
Make sure the audio interface drive is properly installed in your computer. If you have your backing music already as a wav or mp3 file, you can use software like Audacity or Audition to record your vocal. If you're working on an music project, you can directly record the vocal part within your favorite DAW such as Cubase, Pro Tools, Logic, Reaper, etc. I recommend you install the ASIOv2 as well, since it reduces the delay from the vocal source to the computer.

5. Headphone
You also need to have a headphone to let the singer listen to the music while recording.

Before recording:

There are a couple of things you need to do before actually record.

1. Make the singer feel comfortable. To achieve this, you need to use a mic stand to place the mic at a suitable height. Don't put on the headphone while not recording since this will make the singer uncomfortable. You may also need to let the singer drink some water beforehand to ease his/her throat as well as the vocal cord a little bit so as to prevent unpleasant sound when recording.

2. Connection. That is to setup your work space. To do so, you first zero the input gain of the mic port, turn off the +48 phantom power if necessary, then turn off the audio interface; then plug in the mic using the XLR, turn on the audio interface, turn on the phantom power if you're using condenser mic, then turn up the input gain knob. Ask the singer to sing the song he/she wants to record, especially the loudest part, meanwhile you monitor the input level, adjust the input gain so that the loudest part of the singing is about -1dB to -3dB. Connect the headphone to the audio interface and pass it to the singer. Adjust the volume of the headphone. Then you're all set.

While recording:

While recording, it's your job to monitor everything including input level, output level and the overall quality of the recording. If either level goes beyond 0dB, you should consider adjust the input gain again. If at some point you notice the singer produces some unexpected sound or unpleasant sound, you could stop him/her and make him/her do one take again. You know, with those powerful recording software, you don't have to do everything from the top, but you can if you want something real natural and complete.

After recording:

When you finished recording, you should first disconnect the headphone by zeroing the output gain knob and unplug the jack. Then disconnect the mic by zeroing the input gain knob, switch off the phantom power if necessary and then unplug the jack. Then you can safely remove the audio interface from the computer. After that you can do whatever audio editing you like, such as compression, reverbation, EQ, normalization etc., in the DAW or recording studio software, before you finally export the audio mixdown.

This is the end of the lesson. I can't say I'm very good at vocal recording, but I have recorded about 30 of my own works. Summing up those experience I had, I found vocal recording depends greatly on the recording environment, the state of the singer as well as whether the song fits the expression range and style of the singer or not. If all these are good, the recording will almost always be successful, and the post-editing will be easy. Otherwise, there must be a lot of work in the editing stage and it still turns out not good enough. Thanks for your attention! I hope you like my lesson. Any feedback are welcome.

tangkk

Tuesday, September 10, 2013

Pop Music Machine

The problem is: How can we invent an algorithm to generate great pop music melody? This problem seems to be a generation problem, but in fact it imply a conditional factor which is "great", so it is actually two problems: a generation problem and an evaluation problem.

Is the evaluation problem a simple "accept/reject" problem? Seems not. Pop music melody seems hard to be divided as good or bad. They are usually divided into several levels: very bad, bad, OK, good, very good, extremely good, etc. But since our requirement is to generate "great" music melody, the evaluation algorithm can be decided to only accept the melody at the top level and reject all the rest. Then it becomes an "accept/reject" problem. The next problem is what to accept? That is, what defines great pop music melody in terms of computer language? Can this standard of "acceptable" be evolved and updated over time? This is a big problem we're going to dig deep into in this article.

Once we have the evaluation algorithm, the generation part is obvious. Actually we can randomly generate musical sequence and feed them to the evaluation module. But is this the only approach? If it is, all we have to solve is only the evaluation problem. But can we do it smarter, to make the system generate an acceptable melody within much less time? To put it to the extreme, can we generate different acceptable melody every time? This is also what we're going to talk about in this article.

to be continued...

Wednesday, June 26, 2013

RandomMelody

Here is the link for you to taste it:
https://googledrive.com/host/0B1i5x1YLBRrkbkdONnRhMmNCMTA/index.html

This project is called "RandomMelody". It generates random melody while you draw on the sketch board, and the resulting graphics can uniquely reflect the melody history and sometimes even "predict" the future melody. See if you can "predict" the future from the graphics without looking at the source code. When opened, wait for some time until all things are loaded, then you can start to play! Note that when you click the button it plays, when you release the button, it stop immediately and play another note. When you drag, see what happen.

tangkk

*************************************************************

It is done by processing. Here are some of the code:

https://docs.google.com/file/d/0B1i5x1YLBRrkTGQzT3JEYi1uNWM/edit?usp=sharingd

//The MIT License (MIT) - See Licence.txt for details

// Abstract: This is an app making logically random guitar/piano mix melody by simply drawing on the screen

Maxim maxim;
AudioPlayer [] Piano;
int rann1 = 0;
int rann2 = 0;

int randrag1 = 0;
int randrag2 = 0;
boolean haveplayed = false;

void setup()
{
size(768, 1024);
maxim = new Maxim(this);
Piano = loadAudio("Piano/Piano", ".wav", 22, maxim);
background(0);
rectMode(CENTER);

}

void draw()
{
noFill();
}

void mouseDragged()
{
// deal with the graphics
float red = map(mouseX, 0, width, 0, 255);
float blue = map(mouseY, 0, height, 0, 255);
float green = dist(mouseX,mouseY,width/2,height/2);

float speed = dist(pmouseX, pmouseY, mouseX, mouseY);
float alpha = map(speed, 0, 20, 7, 10);
float lineWidth = 1;

noStroke();
fill(0, alpha);
rect(width/2, height/2, width, height);

stroke(red, green, blue, 255);
strokeWeight(lineWidth);

float ran = random(1);
if(ran > 0.3)
brush1(mouseX, mouseY,speed, speed,lineWidth, Piano);
if((ran > 0.2) && (ran <= 0.3))
brush2(mouseX, mouseY,speed, speed,lineWidth, Piano);
if((ran > 0.03) && (ran <= 0.2))
brush3(mouseX, mouseY,speed, speed,lineWidth);
if(ran <= 0.03)
brush4(pmouseX, pmouseY,mouseX, mouseY,lineWidth);

if (haveplayed == false) {
randrag1 = (int)random(22);
if(random(1) < 0.1) {
Piano[randrag1].play();
haveplayed = true;
ellipse(mouseX,mouseY,randrag1*15,randrag1*15);
}
randrag2 = (int)random(22);
if(random(1) < 0.03) {
Piano[randrag2].play();
haveplayed = true;
ellipse(mouseX,mouseY,randrag2*15,randrag2*15);
}
}

}

void mousePressed()
{
Piano[rann2].stop();
rann1 = (int)random(22);
Piano[rann1].play();
ellipse(mouseX,mouseY,rann1*8,rann2*8);
}

void mouseReleased()
{

Piano[rann1].stop();
Piano[randrag1].stop();
Piano[randrag2].stop();
haveplayed = false;

rann2 = (int)random(22);
Piano[rann2].play();
ellipse(mouseX,mouseY,rann2*8,rann1*8);
}

//The MIT License (MIT) - See Licence.txt for details

void brush1(float x,float y, float px, float py, float lineWidth, AudioPlayer [] Piano) {
strokeWeight(lineWidth);
//ellipse(x,y,px,py);
line(x,y,0,0);
// line(x,y,width,0);
// line(x,y,0,height);
// line(x,y,width,height);

int pitchSelect;
int unit = height/21;
System.out.println("unit: " + unit);
pitchSelect = (int)(y/unit);
if(pitchSelect < 0)
pitchSelect = 0;
if(pitchSelect > 21)
pitchSelect = 21;
System.out.println("pitchSelect: " + pitchSelect);
//Piano[21-pitchSelect].play();

return;
}

void brush2(float x,float y, float px, float py, float lineWidth, AudioPlayer [] Piano) {
strokeWeight(lineWidth);
pushMatrix();
translate(x,y);
rotate(random(px));
rect(0+random(50),0+random(50),10,10);
popMatrix();

int pitchSelect = (int)(x+random(50))%22;
if(pitchSelect < 0)
pitchSelect = 0;
if(pitchSelect > 21)
pitchSelect = 21;
System.out.println("pitchSelect: " + pitchSelect);
if(random(1)>0.5)
//Piano[pitchSelect].play();
return;
}

void brush3(float x,float y, float px, float py, float lineWidth) {
strokeWeight(lineWidth);
pushMatrix();
translate(x,y);
rotate(random(px));
line(0+random(50),0+random(50),0,0);
rotate(random(px));
line(0+random(50),0+random(50),0,0);
rotate(random(px));
line(0+random(50),0+random(50),0,0);
popMatrix();
return;
}

void brush4(float x,float y, float px, float py, float lineWidth) {
strokeWeight(lineWidth);
//line(px,py,x,y);
triangle(px,py,width-x,height-y, width, height);
triangle(width/2+((width/2)-px),py,width-(width/2+((width/2)-x)),height-y, width, 0);
triangle(px,height/2+((height/2)-py),width-x,height-(height/2+((height/2)-y)), 0, height);
triangle(width/2+((width/2)-px),height/2+((height/2)-py),width-(width/2+((width/2)-x)),height-(height/2+((height/2)-y)), 0, 0);
return;
}

//The MIT License (MIT)

//Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
//The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

//THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

AudioPlayer [] loadAudio(String stub, String extension, int numAudios, Maxim maxim)
{
AudioPlayer [] Samples = new AudioPlayer[0];
for(int i =0; i < numAudios; i++)
{
AudioPlayer Sample = maxim.loadFile(stub+i+extension);
Sample.setLooping(false);
Sample.volume(0.5);
if(Sample != null)
{
Samples = (AudioPlayer [])append(Samples,Sample);
}
else
{
break;
}
}
return Samples;
}

Saturday, June 22, 2013

Take a photo of our feeling

Someday when I was walking in the campus, a fresh memory for no reason just come to my mind, reminding me of the days when I first stepping into this campus, when I was so excited about this brand new place, etc and etc. At that very moment, which just lasted for a few seconds, I was so happy, and was like this was really the first time I came to this place.

Sometimes we take photos to try to catch those treasurable moments in our lives. Although I still take photos these days, I seldom take a look at those photos. For me, most of the times those photos only remind us with what happened, but they don't really remind us of how we actually feel at those moments, like what I described above. It would be much more valuable if our feeling can also be freezed somewhere in the digital world, and we can access them whenever we want.

At least I can think of one good thing about this freezing. A couple live together for many years may lost their sense of favour, or love, towards each other. Sometimes couples try hard to grasp the memory of their first met or first kiss just to remember themselves how much they like each other at the beginning, but they failed. They took a lot of photos to remember those happy hours, but some of them still ended up divorces. Try to think of if feeling can be captured, stored, and instantly accessed, there will be no more difficulty for couples to recall their feeling how much they love each other. Would that be great?

Google Translate 谷歌翻译：

有一天，当我走在校园里，没有理由记忆犹新刚刚来到我的脑海里，提醒我的日子，当我第一次踏进这个校园，当我是如此兴奋，这个全新的地方，等等那一刻，这只是持续了几秒钟，我太高兴了，很喜欢这真的是我第一次来到这个地方。

有时我们也试图抓住那些智胜宝的时刻，在我们的生活中的照片。虽然我还是拍照，这些天，我很少来看看那些照片。对于我来说，大部分时间这些照片提醒我们发生了什么事，但他们真的不提醒我们，我们实际上是如何在那些时刻的感觉，就像我上面描述。如果我们的感觉还可以，冻结在数字世界的某个地方，这将是更有价值的，只要我们想，我们可以访问它们。

至少，我能想到的关于这个冻结的一件好事。一对夫妇在一起生活了很多年，可能会失去他们的青睐，或爱的感觉，向对方。有时夫妇都很难把握的记忆他们第一次见面或第一次接吻只是记得自己多少，他们在开始喜欢对方，但他们失败了。他们拍了很多的照片，还记得那些欢乐时光，但一些人仍然结束了离婚。试想想，如果感觉可以被捕获，存储和即时访问，将不会有更多的困难，夫妇回忆起他们的感觉，他们彼此相爱多少。将是伟大的呢？

tangkk

Wednesday, June 19, 2013

A platform for people to jam music in small public space

Being in a public space, such as a canteen or a plaza, you'll hear some music coming out of centralised speakers. Most of these music are of a soft style. They tend to be slow and smooth, without vocal part. Sometimes you feel like the background atmosphere could be made better. Or sometimes, you'd just like to show up your existence, send a message or to do something for fun. Or in some other times, you feel boring and just would like to find something to do. Then maybe you can try to jam in the background music to make something interesting happen.

How about we invent a social network jamming platform(actually this has already been implemented by my team) fulfilling this need. Let's call it "WIJAM", meaning, "we instantly jam together over WIFI". The basic idea is easy: someone takes out his/her cell phone, opens an app or something, starts to create some melody, and the melody created is instantly mixed and broadcasted with the original background music and other melodies from other guys via a central speaker system.

But there are many problems underlying this scenario, and these problems lead to deeper research under the topic. The first problem naturally pop out is: do they really know how to jam? A huge problem. Assuming some of them are musical novices, they may have some very good music ideas in mind, but don't know how to express on traditional instrument layout such as a keyboard. OK, maybe we provide them with a fix musical scale, say a pentatonic scale or Ionian scale, so that they are within the scale. Then what about the key? Some simple background music use only a fix key without modulation. Even in with such background music, the novices still need to choose a key to apply the scale. How is it possible? Not to mention those grooves that change keys or even change available scales. No, we cannot let users to determine such a lot of things, we should leave them as comfortable and enjoyable as possible. This leads to the idea of a master controlling system to instantly assign keys and scales to all the users. Note that the final outcome of the jamming is playbacked via central speaker, this master system is also in charge of collecting the performance from players and distribute them to the speakers.

Yet this is far from the end of the story. As you know, the touch screen of most mobile phone is relatively small. If you ever had an experience playing a mobile phone keyboard, you probably think of it a bad experience because the keys on the screen are too small to touch. Back to our scenario, what should the master actually assign to the users? A real piano keyboard contains 88 notes, containing approximately 7 octaves. The number of all the notes of a C Ionian scale(which contains 7 identical pitch class) is approximately 7*7 = 49. But obviously we cannot afford all of the 49 notes on a single cell phone screen. Note that most of the time piano players only focus on 2-3 octaves in the middle of the keyboard layout, namely they seldom go to the very low pitch or very high pitch part. Also note that our scenario has background music already, which probably contain the bass part. So we can safely omit 3 lowest octaves and 2 highest octaves, keeping about 2-3 octaves, which still have 14-21 notes. This amount of notes is good enough for normal expression. But then how should we place these notes within the touch screen?

Actually this is one of the crucial issues that determine the successfulness of this application. It should be admitted that up till now no final decision is made upon this issue. One simple way we adopted at the beginning is to evenly dividing the screen into 4-row 2-column. With this the user has 8 note to express at a time. The master has several 8-note patterns to be ready to assign to the users and each of these pattern is within a certain scale such as Ionian or Dorian. The advantage of this layout is every note has a relatively large touchable area, and the novice user's expression freedom is under controlled to be within a scaled 8-note. But the disadvantages seems outweigh the advantages. For those whose want more notes, they can't get it until the master assign a new pattern. For normal users, they cannot differentiate the notes until they play and listen. They'll soon feel boring when they realize it's non intuitive to control their own expression. There are two facets in the drawbacks described above: 1. the expert users want more freedom, they need detailed control panel; 2. normal users want more control over their expression, they need intuitive control panel.

To cater for both users, the system design rule "leave it for the user" need to be applied, namely, two layouts should be designed and let users choose their preference. For the expert users, a note-based layout should be designed. It contains 16 - 32 notes, and should be layout in a systematically way that is easy to start with and difficult to be virtuoso at the same time. For the novice users, a graphic-based, or drawing-based layout should be preferred. It maps the drawing to up and down of the melody line, which corresponds to the feeling expressed by the users. The algorithm will determine which pitch to use according to a certain chord-scale combination provided by the master. If you are keen enough, you may notice that there is an issue regarding the drawing-based interface: how can the user express "rhythm"?

One approach is to use finger motion to signal a note-on of the new note and note-off of the old note, where the new note is mapped to the new position of the finger. For instruments such as guitar and piano we don't even need to signal the note-off most of the time, and it will be signaled automatically after the note finish decaying. This seems a workable approach, which we haven't implemented yet. Another approach is to use one finger to signal the rhythm, another finger to draw the melody. This is also interesting and quite convenient indeed.

So much for the interface issue. What about the overall performance outcome of this system. How to make all the performances by an ad-hoc group of scattering experts or novices make sense, fit together, or at least, sound good. This problem is two-folded. The easier and more fundamental one is how to make the mixture sounds good, while the more difficult one is to let it makes sense.

So how? To tackle the "sound good" problem is relatively easy. It demands something called algorithmic mixing and mastering. Theoretically(I do not have any source at this statement), the sum of any number of channels of any sound can be made comfortable to human ear as long as it is well mixed and mastered, regardless of the underling musical structure(such as, chord progression) or whatever. Namely, we can always manage to make it sounds comfortable. But the problem is how. As this has not yet been implemented, so it cannot be told for the moment. But let's make some guess. Say for example a very simple algorithm would be to set the volume of every channel to 1/n, where n is the number of channel. This makes sense, but not an ideal solution, as you may argue that what if some of the channel have higher weight. So here comes the problem, how to determine which channel has a higher weight? One approach is again to leave it to the user, but since we assume most of the users are musical novice, it may not work as wished. Note that our scenario is public space jamming, which is very different from an on stage live performance. On stage performance yields a sense of being focused, while our scenario yields a sense of scattering, in which every one enjoy themselves being hidden instead of being watched. So the "weight" doesn't convey the same meaning as before. Interestingly, people participating in this jamming will probably also wish their performance be heard by all other people. By combining these two observations, we can safely derive a bottom line of the auto mixing and mastering mechanism, which is to let every player at least be able to hear their part of contribution from time to time. By this key finding, we can safely write an algorithm involving some randomness to implement this function. Not a big deal yet.

The big deal is how to make the outcome make sense. To make sense, a logical expression of music is needed. It's more than to comfort our ear, but to comfort our mind. Since the basic assumption is the participators mostly are musical novice, this problem becomes an algorithmic composition problem. It could be a classical algorithmic composition problem when the musical content involved are pitches and timbres of standard instruments, or it could also be a new algorithmic composition problem when more sound synthesis elements are involved. Again this could be divided into two sub-problems. To tackle the classical one, we should attempt a more strict algorithmic composition approach, which contains a lot of AI stuff and far from fully developed yet. The algorithmic composition techniques can be found in lots of literature. While a "modern" algorithmic composition" problem, which aims to output some modern musical style such as electroacoustic, is relatively easier as I see, because the aesthetics of these are much more subjective than those music within the range of classical music theory. And for implementing the algorithm, there are several ways, one way is to implement the algorithm on user's side, therefore perhaps every user when jamming is within their own algorithmic logic. But this has disadvantage as you may easily notice is that what if their "logic" collide with each other, so that the outcome of the jamming is not so pleasant? True! So we have another approach. In this approach, the algorithm is implemented within the master side, therefore master can cooperate all the users to create a piece of good music. If implementing like this, the master is actually an algorithmic composer and conductor as well. A third approach would be both master and users implement the algorithms, while there is a feedback channel from master to the user, with the feedback indicating the user's algorithm to make a certain change to adjust itself to the whole performance. I think the third approach is the optimal one. Talking more about this topic is out of my range of ability for the moment, so I'd better stop talking here.

And there are still other issues. The audio engine, for now this project uses AUPreset +AudioUnit+AVSesssion to make sound, the AUPreset file points to the prerecored instrument samples by Apple Garageband. I don't know whether this will be legal or not(but according to the official statement it seems legal). Anyway, a more interesting and challenging approach would be to try to use the mobile STK, mobile csound, or ChucK for sound synthesis. Hopefully with one of these engine the size of the app can be greatly reduced, and the app is filled with some of the most advanced stuff as well. Another issue is whether to use OSC instead of MIDI as the music performance transportation media, since it seems OSC has a lower network transport latency. We'll see. A third issue, which is a big one, is the evaluation issue, and I'm gonna use the next two paragraphs to discuss it.

This kind of work, if ever published to the academic journals or papers, will definitely confront a problem of evaluation. In other research areas, such as computer architecture, the evaluation can be done quite straight forward. There will be indexes indicating whether an architecture(or a certain architecture improvement) is good or not, the most widely used ones being "performance", which indicates how fast a system runs. While in the field of computer music, the evaluation problem becomes much more ambiguous. How to determine whether a computer music system is good or bad? To narrow down the range of discussion, in what sense can we say that a network collaborative system is good. One approach that can be found in many papers is to "let the audience judge". In these papers, the "feedback" of the audience is presented, some of them being the subjective feeling, some of them being advises, some of them may also be questionnaires. Similarly, in a paper I saw the evaluation method is to post the outcomes of the computer music system on the web and let viewers rate them. Besides, there is "let the participants judge" method. The logic behind these two mentioned evaluation methods are all quite natural and obvious, since music is a aesthetic process, the final judgement should be made by human being.

But there is still another approach, which should be called "machine appreciation". This method should be implemented with "machine listening " in a very high level sense. Ordinary machine listening only cares about the audio material and the structure behind the audio material, while machine appreciation should demand a higher level of machine listening which cares about the musical material and also the musical structure. Of course every thing can be scaled down to a simplest case. In the case of machine appreciation in the context of classical music structure expression, the simplest algorithm only needs to take care of whether the notes are within scale or whether the voice leading obeys the rules. But of course, as you will agree, this is far from enough. Whether machine is able to appreciate music is itself also a big big question to be answered.

Nevertheless, we can use machine to do some measurement, such as whether by doing such and such the users are becoming more active, or whether such and such can make a collaborative system being more responsive. These are something machine can absolutely do.

So much for the evaluation part. Now I guess I've already given an introduction to this collaborative music jamming application in small public space. I've discussed the big scenario, the role of master and users, the user interface issues, the output quality issue as well as the evaluation issue. With all of these, I guess an extremely great public space jamming application can be created! Hopefully it can be done soon!

If you are a "master" and looking for free jam along tracks or backing tracks, here are some great jam-along tracks:
http://www.wikiloops.com

to be continued...

tangkk

作为一个公共空间，如食堂或广场，你会听到一些音乐集中扬声器。大多数这些音乐的柔美风格。他们往往是缓慢和平稳，没有人声部分。有时候你会觉得像可以作出更好的背景气氛。或者有时候，你只是想显示你的存在，发送信息，或者做一些乐趣。或者在另外一些时候，你觉得无聊，只是想找到事做。那么也许你可以尝试在背景音乐的果酱，做一些有趣的事情发生。

怎么样，我们发明了一种社会网络干扰平台（实际上，这已经实现我的团队），满足这方面的需要。让我们叫它“WIJAM”，意思是“我们立即果酱一起通过WiFi”。基本的想法很简单：有人需要他/她的手机，打开一个应用程序或东西，开始创建一些旋律，和创建的旋律瞬间混合和原来的背景音乐和其他的旋律从其他人通过中央广播扬声器系统。

但也有许多相关问题，这种情况下，这些问题导致下深入研究的话题。自然蹦出来的第一个问题是：他们是否真的知道如何果酱？一个巨大的问题。假设其中一些音乐的新手，他们可能会在心中有一些很不错的音乐理念，但不知道怎样来表达对传统乐器，如键盘布局。好吧，也许我们为他们提供一个固定的音阶，说五声音阶或爱奥尼亚规模的，使它们的规模内。那么什么关键？一些简单的背景音乐只能使用一个固定的无调制的关键。即使在这样的背景音乐，新手还需要选择一个关键应用的规模。怎么可能？更不用提那些沟槽改变键，甚至改变现有的尺度。没有，我们不能让用户来确定这样的东西很多，我们应该让他们尽可能舒适和愉快的。这导致一个主控制系统瞬间指定键和扩展到所有用户的想法。此主站系统注意抗干扰playbacked通过中央扬声器的最终结果，是负责收集从球员的表现，并分发到扬声器。

然而，这是从故事的结局。正如你所知道的，大多数手机的触摸屏比较小。如果你曾经有过的经验，打手机键盘，你可能会想到一个坏的经验，因为在屏幕上的按键太小，无法触摸。回到我们的场景，主实际上分配给用户？一个真正的钢琴键盘包含88笔记，约含7个八度。所有的音符一个C爱奥尼亚规模（其中包含7个相同间距类）约7 * 7 = 49。但很明显，我们不能负担所有的49个票据在一个单一的手机屏幕。请注意，大部分的时间钢琴玩家只专注于2-3个八度的键盘布局中，即他们很少去非常低的沥青或非常高音部分。另外请注意，我们的场景已经有背景音乐，这可能包含低音部分。因此，我们可以放心地忽略3个八度最低和2个八度最高，保持约2-3个八度，哪还有14-21音符。此票据金额是不够好，正常的表达。但后来，我们应该如何将这些笔记内的触摸屏？

其实，这是确定这个应用得失的关键问题之一。应当承认，直至后，这个问题现在还没有作出最后决定。我们在开始时通过一个简单的办法是均匀分割屏幕分为4行2列。与此用户有8注意表达一次。 8音符主有几个模式，准备分配给用户和这些图案是在一定的规模，如爱奥尼亚海或多利安。这种布局的优点是每一个音符，有一个比较大的可触摸区域，和新手用户的表达自由是在控制范围内的比例8音符。但缺点似乎远大于优点。对于那些希望更多的音符，他们不能得到它，直到主机分配一个新的模式。对于普通用户来说，他们不能区分票据，直到他们播放和收听。他们很快就会感到无聊的时候，他们意识到这是不直观的控制自己的表达。在上面描述的缺点有两个方面：1。专家用户想要更多的自由，他们需要详细的控制面板; 2。普通用户想他们表达的更多的控制权，他们需要直观的控制面板。

为了配合这两个用户，系统设计规则“离开它为用户”需要被应用，即，两个布局应设计，让用户选择自己的喜好。专家用户，注意布局应设计。它包含16 - 32个音符，应该有系统的方式，是容易下手，不易被演奏家同时布局。对新手来说，一个基于图形或绘图，基于布局应该是首选。它映射图中的向上和向下的旋律线，对应于用户所表达的感觉。该算法将确定，间距根据由主机提供特定和弦规模组合使用。如果你是热衷的话，你可能会注意到，有问题基于绘图接口：用户如何快车“节奏”？

一种方法是用手指运动信号注意对新的音符，音符停止旧的注意，新的便笺被映射到新的位置的手指。乐器，如吉他和钢琴，我们不甚至需要注意关闭大部分时间信号，并，说明完成腐烂后，它会自动发出信号。这似乎是一个可行的办法，这是我们尚未实现。另一种方法是用一个手指到信号的节奏，另一个手指绘制旋律。这也是有趣，确实挺方便的。

这么多的接口问题。关于这个系统的整体性能结果。如何使所有的表演特设小组的专家或新手散射是有意义的，结合在一起，或者至少，声音好。此问题是叠成两折的。更容易和更根本的是如何使混合听起来不错，而它是有道理的，更困难的是让。

因此，如何？为了解决“声音好”的问题是比较容易的。它要求的东西称为算法的混音和母带制作。（我没有任何来源在本声明）从理论上讲，任何声音的任何数目的信道的总和可以只要人耳舒适，因为它是很好的混合和掌握，无论下属的音乐结构（例如，和弦进行）或什么的。也就是说，我们总能设法使这听起来很舒服。但问题是如何。随着这种情况尚未付诸实施，所以它不能被告知的时刻。但是让我们做出一些猜测。说，例如一个很简单的算法，将每个通道设置音量的1 / n，其中n是通道号。这是有道理的，但不是一个理想的解决方案，因为你可能会争辩说，如果一些通道具有更高的权重。所以这里问题来了，如何确定哪个通道有一个较高的权重？一种方法是再次离开给用户，但因为我们假设大多数用户都是音乐的新手，希望可能无法正常工作。请注意，我们的场景是公共空间的干扰，这是非常不同的舞台上现场表演。在舞台上的表现产生了一个被聚焦的感觉，而我们的方案中产生散射，每一个享受自己被隐藏，而不是被监视感。因此，“减肥”不传达以前相同的含义。有趣的是，人们参与这种干扰可能会也希望所有其他人听到他们的表现。通过这两个观测相结合，我们可以放心地得出一个底线的自动混音和母带的机制，这是为了让每个球员都至少能不时听到他们的贡献的一部分。这个关键的发现，我们可以放心地写一个算法涉及一些随机性来实现这个功能。没什么大不了的。

大不了就是如何使结果有意义。为了便于理解，一个逻辑表达式的音乐是必要的。更重要的是，比安慰我们的耳朵，而是要安慰我们的心灵。由于参与者大多是音乐新手的基本假设是，这个问题变成一个算法的组合问题。这可能是一个经典算法的组合问题时所涉及的音乐内容的音高和音色的标准仪器，或涉及更多的声音合成元素时，它也可能是一个新的算法组合问题。同样，这可以分为两个子问题。为了解决经典，我们应该尝试更严格的算法的组合方法，其中包含了很多AI的东西，远未得到充分开发。算法作曲技法，可以发现大量的文献。虽然“现代”的算法作曲“的问题，其目的是输出一些现代的音乐风格，如电，相对容易些，因为我看到的，因为这些都是比音乐更主观的美学古典音乐理论的范围内。为实现该算法，有几种方式，一种方式是，以实现该算法在用户侧，因此，或许是每一个用户时，干扰是在自己的算法逻辑，但是这作为您可以轻松地发现缺点是，如果他们的“逻辑“相互碰撞，使堵塞的结果是不那么愉快吗？真！因此，我们有另一种方法，在这种方法中，该算法实现在主人身边，因此主人可以合作的所有用户创建了一块如果实施这样的，主是实际上是一个算法的作曲家和指挥家，以及第三个方法是将主机和用户实现算法，同时有一个从主机到用户的反馈渠道，反馈指示的好音乐。用户的算法做出了一定的变化自行调节整机性能，我认为第三种方法是最佳的。更多地谈论这个话题是我能力范围的时刻，所以我最好停止在这里谈论。

还有其他问题。音频引擎，现在这个项目，使用AUPreset + AudioUnit + AVSesssion使声，AUPreset的文件点的prerecored苹果的GarageBand乐器样本。我不知道这是否会是合法还是不合法（但根据官方的说法，似乎是合法的）。无论如何，一个更有趣和具有挑战性的做法是尝试使用移动STK移动与csound，或声音合成夹头。希望与这些发动机的应用程序的大小可以大大降低，并且与最先进的东西，以及一些填充该应用。另一个问题是，是否使用OSC代替MIDI音乐表演运输介质，因为它似乎OSC具有较低的网络传输延迟。让我们拭目以待。第三个问题，这是一个大的，是评价的问题，我会用接下来的两个段落讨论。

这样的工作，如果曾经出版的学术期刊或论文，肯定会面对一个问题的评价。评价等研究领域，如计算机体系结构，可以做得相当直截了当。会有指标架构（或某一个架构改进）是否是好还是不好，使用最广泛的是“业绩”，这表明一个系统运行的速度有多快。在电脑音乐领域，评价问题变得更不明确的。如何判断一个电脑音乐系统是否是好还是坏？为了缩小讨论的范围，我们在什么意义上可以说，网络协作系统是好的。一种方法，可以发现在许多论文是“让观众法官”。在这些文章中，“反馈”的观众，其中一些人的主观感觉，他们中的一些建议，也可能是他们中的一些问卷调查。同样，在一篇论文中，我看到的评价方法是电脑音乐系统成果发布在网站上，让观众率他们。此外，还有“让参与者法官”的方法。提到这两个评价方法背后的逻辑是很自然的和显而易见的，因为音乐是一种审美的过程，最后的判决应该由人类。

但是，仍然有另一种方法，这应该被称为“机鉴赏”。此方法应该在实现“听机”在一个非常高的层次感。普通机听只在乎背后的音频材料的音频材料和结构，同时该机的升值应该要求一个更高层次的机器听音乐素材的音乐结构关心。当然，每一种东西可以缩小到一个最简单的例子。在机器升值的背景下，古典音乐的结构表达的情况下，最简单的算法，只需要照顾票据是否是在规模或语音是否服从规则。不过，当然，你会同意，这是远远不够的。无论机是能够欣赏音乐，本身也是一个大的大要回答的问题。

不过，我们可以用机器做一些测量，如无论是通过做这样那样的用户正变得更加活跃，或者是否可以这样和那样的一个协作的系统，反应更灵敏。这些都是机器绝对可以做的东西。

这么多的评估的一部分。现在，我想我已经给介绍这种合作的音乐抗干扰应用在小型公共空间。我已经讨论过大的情况下，主站和用户的角色，用户界面问题，输出质量问题以及评价问题。所有这些，我想极大的公共空间干扰可以创建应用程序！希望可以很快完成！

未完待续...

Thursday, June 6, 2013

Pop song melody as a hierarchical sequence

Consider a piece of pop song melody. You may say it is a sequence of pitches. But if I simply randomly put all those pitches into a MIDI sequence and playbacked by machine, it may probably become a dead sequence. Namely, it does not convey any logically meaning.

So the melody does not contain only pitches. Pitches are the most direct phenomenon we got, but it's more than pitches. To simply extend below, pitches are modulated by the rhythms. Even considering a simple 4/4 time bar with 6 notes inside, there are infinite possibilities of rhythms. Each of them express a different musical feeling. So we take also the rhythm into our previous MIDI sequence, it sounds more logical, meaningful,but somehow still feels dead.

Something is still missing. When a singer sings, she not only sings out pitches according to the rhythms, but also expresses her highs and lows at the same time. These emotional highs and lows can be related to the pitches' highs and lows, but not exactly the same. The emotional feelings adds another layer above the pitches, which is called "dynamics". Dynamics can be simply understood as the loudness or velocity of a note. As you know, we can still do an experiment of dynamics within our MIDI sequence, but, as you may guess, it's still not perfect.

Now we have pitch, rhythm and dynamics, and what else?

Most of the time a singer sings each pitch at a time, but sometimes she drags a pitch here and there. And still some other times, she put an unnoticeable slightly lower or higher pitch slightly before the pitch she's going to sing. By such, the singer creates a smooth flow of sound. This is called articulation. Can we do articulation in MIDI sequence? Yes, but not as easy as the previous three. Simply because there is no simple way to encode articulation. Our effort to emulate a beautiful melody using MIDI sequence can safely stop here. But anyway, we discover the 4th layer of a pop song melody.

So pop song melody as a hierarchical sequence, from bottom to top, as I see, is: 1. Rhythm; 2. Pitch; 3. articulation; 4. Dynamics.

tangkk

******************
Google Translate:
谷歌翻译：
******************

考虑流行歌曲旋律一块。你可能会说，它是一个序列的球场。但如果我只是随机把所有这些球场成MIDI音序和机playbacked的，它可能成为一个死的序列。也就是说，它不传达任何逻辑上的意义。

因此，旋律不包含唯一的球场。球场是我们得到了最直接的现象，但它是多球场。下面简单的扩展，球场的节奏调制。即使考虑一个简单的4/4时间栏里面有6笔记，有节奏的无限可能性。他们每个人都表达了不同的音乐感觉。所以我们又到我们以前的MIDI音序的节奏，它听起来更符合逻辑的，有意义的，但不知何故，仍然觉得死了。

事情是人仍下落不明。当一个歌手唱歌，她不仅唱了球场的节奏，但也同时表示她的高点和低点。这些情绪的高点和低点可以与球场的高点和低点，但并不完全相同。的情感又增加了一层以上的球场，这就是所谓的“动态”。动态可以简单地理解为一个音符的响度或速度。正如你所知道的，我们仍然可以做一个实验动力学在我们的MIDI序列，但是，您可能已经猜到，它仍然是不完美的。

现在，我们有音高，节奏和力度，还有什么？

大多数时候，一个歌手唱的各个节点的时间，但有时她拉着一音高在这里和那里。而且还有另外一些时候，她把一个不引人注意的略低或较高的音调稍前她要唱歌的音调。通过这样，歌手创建一个畅通的声音。这就是所谓的衔接。我们可以做MIDI音序衔接？是的，但不容易，因为前三个。很简单，因为有没有简单的方法来进行编码的衔接。我们的努力，来模拟一个优美的旋律，使用MIDI序列可以安全地停在这里。但无论如何，我们发现一首流行歌曲旋律的第四层。

所以流行歌曲的旋律，从底部到顶部，我看到的，作为一个层次序列是：1。节奏2。间距; 3。衔接; 4。动力学分析。

tangkk

Wednesday, June 5, 2013

On writting a simple mobile musical application

This article does not provide you with details on how to write such app, but on high level decisions and ideas on such issue.

First you need to decide what kind of musical application you want to write. The most common musical application would be an audio player. So let's talk about audio player first. For writing an audio player, you don't need to worry about MIDI or sequencer, all you need to consider is the playback mechanism. Say if you're gonna write such thing on Android, the first thing you need to do is search the keywords "audio", "playback", "audio playback" on the official website of Android development. Alternatively, Google is always there for you to gather essential information. Try to google "Android, audio, playback", and the useful stuff is there!

One thing really important is the sample code. If you got some sample code similar to what you want to achieve, then you're almost done. Or, if the sample code contains some very critical functions that is difficult to implement, you're very lucky because you don't need to worry about those functions any more. With the help of sample code, you don't have to start everything from scratch. What's more, sample code also trains your style of coding, to make you become a better programmer. Therefore I advise you to also search for one more keyword: "sample code".

The same things holds true for writing other applications such as those involve MIDI, such as a mobile piano. Since it involves MIDI manipulation, you need to know how MIDI is manipulated within the given mobile platform. Say if you're gonna write a simple piano for iOS. So you should google: "iOS, MIDI", or you can go to iOS developer library and search "MIDI". Don't forget about the "sample code". Except for MIDI, such application also involves a so called "virtual instrument". On iOS, this can be implemented using a type of file called "AUPreset". So search for this! Through some iterations, you can eventually arrive the same destination as all others did.

Usually, you can gain a clear picture of how to implement the functionality of the application relatively quickly. Then you sit down and write the code, use a week or two to debug. And then you realize that the real problem is not about the functionality, but the user interface design! Trust me! Until then you realize value of those design people!

tangkk

***************
Google Translate:
谷歌翻译：
***************

本文不提供您如何编写这样的应用程序，但高层次的决策和想法等问题。

首先，你需要决定你想要写什么样的音乐应用。最常见的音乐应用程序将是一个音频播放器。因此，让我们先说说音频播放器。为了写一个音频播放器，你不需要担心MIDI或音序器，所有你需要考虑的是播放装置。说，如果你会写这样的东西在Android上，你需要做的第一件事是搜索关键字“音频”，“播放”，“音频播放”Android开发的官方网站上。另外，谷歌总是在那里为你收集必要的信息。尝试谷歌的“Android，音频播放，和有用的东西是存在的！

真的很重要的一件事是示例代码。如果你得到了你想要实现一些类似的示例代码，那么你几乎可以做。或者，如果示例代码包含了一些非常关键的功能，是难以实现的，你很幸运，因为你不必担心这些功能。示例代码的帮助下，你不必一切从头开始。更重要的是，示例代码还训练你的风格编码，让你成为一个更好的程序员。因此，我建议你搜索一个关键字：“示例代码”。

成立同样的东西用于记录其他应用，如那些涉及MIDI，诸如移动钢琴。因为它涉及到MIDI操作，你需要知道的，MIDI是如何在给定的移动平台操纵。说，如果你要去写一个简单的钢琴为iOS。所以，你应该谷歌：“iOS上，MIDI”，或者你可以去iOS开发库和搜索的“MIDI”。不要忘了“示例代码”。除了MIDI等应用也涉及到所谓的“虚拟仪器”。在iOS上，这是可以使用的文件类型称为“AUPreset”。所以搜索！您可以通过某些迭代，最终到达相同的目的地别人做。

通常，你可以清楚地了解如何实现该功能的应用程序相对迅速。然后，你坐下，写得了代码，使用一个或两个星期调试。然后你意识到，真正的问题是没有的功能，但用户界面的设计！相信我！在那之前，你意识到这些设计的人的价值！

tangkk

God bless us with true love

Sunday, October 20, 2013

How To Record Vocal

How To Record Vocal

Check list:

Before recording:

While recording:

After recording:

Tuesday, September 10, 2013

Pop Music Machine

Wednesday, June 26, 2013

RandomMelody

Saturday, June 22, 2013

Take a photo of our feeling

Wednesday, June 19, 2013

A platform for people to jam music in small public space

Thursday, June 6, 2013

Pop song melody as a hierarchical sequence

Wednesday, June 5, 2013

On writting a simple mobile musical application

My Sina Weibo

My Google Play Store