
Tag Archives: C

Reliable Asterisk CDRs

This is going to be a long and technical article about capturing CDRs (call detail records) using the custom extensions file and an AGI script.

I designed, built, deployed, and maintain a number of Asterisk servers which are used as part of a VOIP arbitrage system.

The first generation system, which I inherited, was a single machine that housed the Asterisk server, the database server for CDR and other billing and reporting data, and a PHP webapp that reported on the data. The system worked (a) when the call volume was low and (b) when the overall amount of data was low. Needless to say I was brought in when the “system” (1) started hanging, (2) started losing call detail records, and (3) could not return webapp results before the browser timed out. It was a mess.

The new system uses multiple systems (n+1). There can be a limited number of Asterisk servers connected to a dashboard. The dashboard is where all the data is stored, where the ETL is performed, and where the reporting is initiated. The following design supports about 5000 channels on fairly moderate hardware, will auto-restart/recover if there is a crash, and will operate independently from the dashboard.

Here’s how it works. In the VOIP reporting business we live and die by the accuracy of the CDR. In VOIP there is no association equivalent of Visa or MasterCard that sets the rules or arbitrates the discrepancies. It’s always going to be “on you”. Therefore it’s always important to get the data from the switch. RDBMS people like to call the transaction ACID. Something very similar applies here.

In the basic Asterisk installation there are a number of ways to get the CDRs from the system. You can export them directly into flat files or directly into one of several brands of SQL databases like SQLite3. The problem with this approach is that the database is expensive in terms of resources and the flat file is inefficient because it’s one big file. This is additionally cumbersome when you’re trying to report and monitor in realtime.

My strategy is twofold. (1) Export the CDRs to a small flat file and change the flat file once a minute. (2) Then send the flat file to the dashboard server for processing. This is surprisingly efficient and it allows the system to continue to process calls if the dashboard is rebooting or in maintenance mode.
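A minimal sketch of the once-a-minute rotation, in Python for brevity. The `cdr-YYYYMMDD-HHMM.csv` naming scheme and both function names are assumptions made for illustration; the post does not say how the production files are named. The idea is only that the file name is derived from the current minute, so any file from an earlier minute is closed and safe to ship to the dashboard.

```python
from datetime import datetime


def cdr_name(when):
    """Name of the flat file that collects CDRs for the minute containing `when`.

    The naming scheme is hypothetical; zero-padded fields make the names
    sort in chronological order.
    """
    return when.strftime("cdr-%Y%m%d-%H%M.csv")


def ready_to_ship(names, now):
    """Files from earlier minutes are complete and can be sent to the dashboard."""
    current = cdr_name(now)
    return sorted(n for n in names if n < current)


# Made-up example: three files on disk at 14:06:30.
now = datetime(2012, 3, 10, 14, 6, 30)
on_disk = [
    "cdr-20120310-1406.csv",  # still being appended to
    "cdr-20120310-1404.csv",
    "cdr-20120310-1405.csv",
]
print(ready_to_ship(on_disk, now))
```

If the dashboard is down, the closed files simply accumulate on the Asterisk server until the transfer succeeds, which is what lets call processing continue independently.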

While this approach has been wildly successful there is still some room for improvement. The first improvement went live today.

Today’s challenge: When Asterisk receives an incoming call it authenticates the source and then tries to locate a route for the call (or the destination). The routing of the call takes place in a file called extensions_custom.conf. In this file you’ll see some “code” that is more of a macro or script than an actual programming language. This macro tells Asterisk what to do with the incoming call and, at the end of the call, to hang up. There are some other more complex functions like interactive voice prompts and voicemail but we’re just interested in routing. When the call completes we have to initiate a “hangup” and then we need to record the CDR.

So based on the approach above when the call was terminated (hangup) control would be passed to a 3rd layer script (through the AGI interface). This script could be written in any language and it would collect all of the data from the call and append the CDR to the flat file.

So let’s review:

  1. call initiated
  2. authenticate the call source
  3. check the extensions config for an appropriate route
  4. when the call is complete hangup
  5. send the CDR to a PHP script through the AGI interface

Step #5 looks like this:

exten => _X.,n,Hangup
exten => h,1,Set(CDR(userfield)=Hangupcause:${HANGUPCAUSE} Qos:${RTPAUDIOQOS})
exten => h, n, AGI(cdr_new.php, ${SIPCALLID}, ${CDR(dcontext)}, ${SUPPLIER},${CDR(start)}, ${CDR(duration)}, ${CDR(billsec)}, ${CDR(disposition)}, ${HANGUPCAUSE}, ${V_NETWORK}, ${CDR(lastapp)}, ${DEST})
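For readers unfamiliar with AGI, here is a rough sketch of what the receiving script sees. Asterisk writes a block of `agi_name: value` lines to the script's STDIN, terminated by a blank line, and passes the dialplan arguments on the command line. The sketch is in Python rather than the PHP or C used in production; the field names in `FIELDS` are labels I chose to mirror the argument order above, and the sample values are made up. Arguments are stripped in case the spaces after the commas in the dialplan reach the script.

```python
import io

# Labels follow the argument order of the AGI() line; the names are hypothetical.
FIELDS = ["callid", "dcontext", "supplier", "start", "duration", "billsec",
          "disposition", "hangupcause", "network", "lastapp", "dest"]


def read_agi_env(stream):
    """Parse the 'agi_key: value' header block; a blank line ends it."""
    env = {}
    for line in stream:
        line = line.strip()
        if not line:
            break
        key, _, value = line.partition(":")
        env[key.strip()] = value.strip()
    return env


def build_cdr(args, env):
    """Combine the command line arguments with the AGI header into one record."""
    record = dict(zip(FIELDS, (a.strip() for a in args)))
    record["uniqueid"] = env.get("agi_uniqueid", "")
    return record


# Made-up input standing in for sys.stdin and sys.argv[1:].
fake_stdin = io.StringIO(
    "agi_request: cdr_pub\n"
    "agi_channel: SIP/example-00000001\n"
    "agi_uniqueid: 1331388301.1\n"
    "\n"
)
fake_args = ["abc123@example", " outbound", " supplier-a", " 2012-03-10 14:05:01",
             " 61", " 58", " ANSWERED", " 16", " net-a", " Dial", " 15555550100"]

cdr = build_cdr(fake_args, read_agi_env(fake_stdin))
print(",".join(cdr[f] for f in FIELDS))
```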

The module that was replaced was “cdr_new.php”. The new module took the same parameters and was called “cdr_pub”.

The problem with the original PHP code was that it processed the incoming data, then created the target filename, and then opened the target in order to append a record. It’s been working great but we are at a point where we might be losing some CDRs (this is not definitive, just intuition). With 5000 channels running there can be as many as 5000 instances of the routing process. That means when 5000 calls terminate at once there is a rush to append their CDRs. It’s simply not efficient for PHP to block when appending to the file. Not to mention that there is a lot of overhead for the PHP interpreter to load with each call completion.

The performance issues:

  • the latency to load php with each call completion
  • the possible deadlocks when more than one process tries to append to the same file at the same time. Blocking and resolution are not guaranteed.

The new plan. I rewrote the PHP script in C. Even with the few libraries I needed it’s not more than 20 or 30K. Since it’s native C it loads very fast. So this program gets all of the data from the AGI in the form of command line parameters and data on STDIN. Then, instead of rushing to append the data to a file, the small program sends or “publishes” the CDR to a redis pub/sub queue. There is a single, external application that “subscribes” to the redis queue, and when a message event arrives that external app writes the CDR to the flat file. Since there is only one external app appending to the flat file it cannot have the same problems.
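The single-writer idea can be sketched without a Redis server. The example below is in Python and substitutes an in-process `queue.Queue` for the Redis channel and an `io.StringIO` for the flat file, so it is a stand-in for the pattern and not the production C publisher or Ruby subscriber. Many publishers hand off their CDR and return; exactly one writer appends.

```python
import io
import queue
import threading

STOP = None  # sentinel telling the writer to exit


def writer(q, out):
    """The single subscriber: the only code that ever appends to `out`."""
    while True:
        cdr = q.get()
        if cdr is STOP:
            break
        out.write(cdr + "\n")


def run_demo(n_calls=50):
    q = queue.Queue()    # stand-in for the Redis channel
    out = io.StringIO()  # stand-in for the per-minute flat file
    sub = threading.Thread(target=writer, args=(q, out))
    sub.start()

    # Each "call" publishes its CDR and returns immediately.
    pubs = [threading.Thread(target=q.put, args=("call-%d,ANSWERED" % i,))
            for i in range(n_calls)]
    for p in pubs:
        p.start()
    for p in pubs:
        p.join()

    q.put(STOP)  # all publishers are done, so nothing can arrive after this
    sub.join()
    return out.getvalue().splitlines()


lines = run_demo()
print(len(lines))  # one line per call, none interleaved
```

One difference worth keeping in mind: unlike this in-process queue, Redis pub/sub does not hold messages for a subscriber that is not connected, which is why a fallback log matters.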

One side note. If the publisher fails then the message event is posted to the syslog. And if the subscriber fails to append to the flat file then it also posts an event onto syslog. If something goes horribly wrong (with the exception of disk space) then we should have a chance to replay the calls in the dashboard by scrubbing the syslog file.
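Scrubbing the syslog could look something like the sketch below (Python). The log line format, the `FAILED` marker and the `cdr_sub` program name are all assumptions made for illustration; only `cdr_pub` is named in this post, and the real message layout is not shown here. The sketch pulls the CDR payload out of matching lines and drops exact duplicates so a replay does not double-count a call.

```python
import re

# Hypothetical format: "<timestamp> <host> cdr_pub[<pid>]: FAILED <cdr fields>"
CDR_LINE = re.compile(r"\bcdr_(?:pub|sub)\[\d+\]: FAILED (?P<cdr>.+)$")


def recover_cdrs(syslog_lines):
    """Return the CDR payloads logged by a failed publish or append, in order."""
    seen = set()
    recovered = []
    for line in syslog_lines:
        match = CDR_LINE.search(line.rstrip("\n"))
        if not match:
            continue
        cdr = match.group("cdr")
        if cdr not in seen:  # the same record may be logged more than once
            seen.add(cdr)
            recovered.append(cdr)
    return recovered


# Made-up sample log.
sample = [
    "Mar 10 14:05:01 pbx1 cdr_pub[4242]: FAILED abc123,outbound,ANSWERED\n",
    "Mar 10 14:05:01 pbx1 sshd[99]: Accepted publickey for root\n",
    "Mar 10 14:05:02 pbx1 cdr_sub[17]: FAILED def456,outbound,BUSY\n",
    "Mar 10 14:05:03 pbx1 cdr_pub[4250]: FAILED abc123,outbound,ANSWERED\n",
]
for cdr in recover_cdrs(sample):
    print(cdr)
```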

PS: one side note. This configuration also limits the number of simultaneous channels. Therefore if the CDR recording process blocks for any reason, that will prevent the system from accepting the next call when the system is running at capacity for that source.

PS: the subscriber app was written in Ruby. Installing Ruby on my production Asterisk server was not my first choice but it was worth it. The Ruby code was compact and it handled exceptions nicely. There were some idioms that I liked a little more than Python’s. Some of the development took place in Ruby 1.9.3 while the default version on the server was 1.8.7, so I had some challenges getting it to run and needed to install some additional packages… which, as a side note, confirms all of my previous beliefs about full stack awareness.

PS: One last note. When deciding on the publisher implementation, and after initially abandoning C based on its lack of a JSON library that made sense, I tried Go and then considered Java, other JVM-based languages and several dynamic languages… In the end C was the only choice because of its size, load latency and runtime.

 

Posted by on 2012/03/10 in nosql, ProgLang, VOIP

 


Response to “Seven Languages in Seven Weeks”

Bruce Tate is a good writer and recently he published a book titled: “Seven Languages in Seven Weeks”. I do a lot of career development so I completely agree with the premise, however, the first place I get lost is the selection of languages:

  • Ruby
  • Io
  • Prolog
  • Scala
  • Erlang
  • Clojure
  • Haskell

Initially there is nothing wrong with this selection. Tate tells the reader that the choices were made by asking his readers. At first glance this might make sense (blame the reader), however, it’s more dubious than that.

I recently rebuked a blog for its Agile survey results because the sample group came from the author’s own social circles, and I believe the same can be said here. About the only thing in Tate’s favor, however, is that the words “practical” or “pragmatic” were omitted. Had they been present then I believe that the language selection might have echoed github’s language survey.

In hindsight I should have read the TOC before I purchased the book. I already had a cursory knowledge of Ruby, I’ve been coding erlang professionally for 3 years, prolog was deprecated when Borland’s Turbo Prolog was decommissioned, I’ve reviewed Haskell and it’s of no general interest… I think erlang got it right. And as the saying goes, “lipstick on a pig, it’s still a pig” when it comes to Scala and Clojure.

If it were my book I think the list would have been a little different:

  • go – the language solves some concurrency and messaging issues found in many other languages; it’s also statically linked.
  • erlang – lightweight processes, fast, has momentum
  • python 3 & perl 6 and PHP (Facebook) – These updates have been in development a long time. It’s critical to understand whether the new versions are worth the mind share or if they should be deprecated.
  • modern C or C++ -
  • groovy – It’s java lite and while I do not have any practical experience with it, since I do a lot of development in python, perl and ruby this makes sense in the JVM environment.
  • serverside javascript (NodeJS, MongoDB, etc) – another up and comer. This is probably more like a 1/2 week experience, however, just because you know browser javascript does not mean that you’ll be successful on the server side.
  • R – The google-ites and Facebook-ies are going crazy with analytics and now that the “social” aspect has entered just about every website tools that render information about the business are becoming critical. R has a great many tools to help out. Hopefully one does not need a Phd in math to be successful.

What sets my list apart from Tate’s is that it looks to the future. “Where are we going?” not “What’s slipped between the cushions?”

As a sidebar, I have another list that I think might be interesting: “Seven Frameworks in Seven Days”. You’re not going to become an expert in seven days but you might know enough to make a choice for your next project based on that experience:

  • TornadoWeb or Cyclone (python) – very capable frameworks but they are event driven.
  • Mojolicious (perl) – another event driven framework.
  • Sinatra (Ruby) – something to attract ruby-ists. It’s as capable as those above.
  • Limonade (PHP) – PHP is powering back up thanks to Facebook’s compiler.
  • Orbit (Lua) – Lua was conceived in the vacuum of Brazil and has an adopted home in World of Warcraft. At some point those programmers are going to want to break out of the game into the real world.
  • Snap (Haskell) – It’s fast.
  • Nitrogen (erlang) – interesting GUI, comet, baked right in.

One reason for the entries in this list is that the language portion of the exercise is trivial. Micro frameworks are not capable of running an enterprise but their low cost of entry is going to get things started so that your burn rate is smaller.

While Snap and Nitrogen are interesting in their own right that’s about it. They will not likely be here in 2 or 3 years but the ideas are great.

Thanks for reading.

 

Posted by on 2011/12/20 in future

 


In response to your interest – in case you want to hire me.

Hi Richard,

Thank you for considering this position. I have been going over your CV and it looks interesting enough to take it to the next stage.  As for helping us better understand where your strengths are as far as your current software engineering capabilities I would like you to answer the following questionnaire.

Cheers,
Ofer

 

Question: Write a simple sorting algorithm (array of numbers) in python and explain it.
Answer:

my_numbers_that_need_sorting = [4,3,4,6,3,6,9]
my_numbers_that_need_sorting.sort()
## there are plenty of APIs out there that do sorting. 
## There is absolutely no reason for me to open my 
## Knuth books and read anything on merge sorts, hash sorts, 
## btree sorts, insert sorts etc. People smarter than me 
## are getting their Phd(s) on the subject.

 

Question: What parts of Python don’t you like and why?
Answer:

I hate the indenting.

 

Question: What is Bitcoin and what do you think about it?

Answer:

 I’m not an economist so I don’t know what the long term ramifications are. As a skeptical systems person I hope there are no bugs.

 

Question: What other software projects have you been involved with? Do you have a github account/website with code you wrote?
Answer:

My code is proprietary. I do give back from time to time. Recently I patched a bug for MongoDB… for fun. Their data “import” utility supports importing JSON, CSV and TSV. When loading TSV and CSV files the leading and trailing whitespace is removed from each cell, however, the TAB character is considered whitespace in the TSV file import code. Therefore, if the first field were an empty field then the data would skew (shift by one cell) and the load would fail. So I identified the offending line of code, wrote a patch, tested it and submitted the patch to the dev team.
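The failure mode is easy to reproduce in a few lines of Python. This is not the mongoimport source, just an illustration of the same class of bug: trimming "whitespace" from a TSV line also eats the leading TAB that marks an empty first field. Both function names are made up for the example.

```python
def split_tsv_buggy(line):
    """Trim all surrounding whitespace first; a leading TAB is lost."""
    return line.strip().split("\t")


def split_tsv_fixed(line):
    """Drop only the line terminator, split on TAB, then trim spaces per cell."""
    return [cell.strip(" ") for cell in line.rstrip("\r\n").split("\t")]


row = "\tAlice\t42\n"  # three columns, the first one empty

print(split_tsv_buggy(row))  # two cells: every value has shifted left by one
print(split_tsv_fixed(row))  # three cells, with the empty first cell preserved
```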

 

Question: What’s your favorite programming language (besides Python)?

Answer: 

I do not have a specific favorite… but if I had to choose it would be one that does the job. And as a manager I want to make sure that there are resources that can pick up the standard when the time comes. (business continuity, see http://github.com/languages)

ASM – when performance is absolutely necessary and I don’t need floating point. (I hate floating point in assembler.)

C – when performance is absolutely necessary and I need floating point or integration with other APIs like libcurl or the gtk.

C++ – when I have to bugfix someone else’s code.

perl – nice for reporting, ETL and other batch operations. (parrot and perl 6 have me concerned about compatibility)

python – same as perl. many of the APIs make things easier than perl. I also like tornado/cyclone and Django for webapp frameworks. (I like easy_install!!)

Java – java is the new cobol. there are APIs and frameworks for just about everything. There was a time when java had a manageable library but now it’s too big and it has fractured. Not to mention that the JDK is semi open source and then there is the Oracle factor.  J2EE is also a well known framework but it is the 1200lb gorilla in the room.

Scala and Clojure – interesting functional languages, if only they were real languages instead of sitting on the JDK. During initial development it is interesting that the languages can interop with traditional JDK libs, however, in the end apps like “lift” are simply calling Jetty. That makes a functional call stack not too functional.

erlang – interesting but not all ‘that’. While it works well for long running processes and its lightweight processes make for some interesting parallelization, it’s better for longer running tasks like phone switches and audio codecs. It’s not well suited for messaging. There are so many other better solutions. Mnesia is useful. It’s nice that it’s integrated into the language, however, there are plenty of warts there and there are so many other DBs that make more sense.

haskell – bloated and disorganized package manager.

prolog – In 20 years I have not found it useful… or in production. Erlang does a better job and Turbo Prolog is long since gone.

R – I would like to use this language, however, I have not found a use-case for it yet. It generates nice graphs and charts… but it requires Intel Fortran to build it. So I’m not sure I want this many dependencies.

LUA – another interesting idea.  Nice that the language is small enough to compile quickly and there is a jit. It’s also interesting that it integrates with C instead of the JDK.

.NET – I’m not a fan of anything that creates a lock-in and Mono while it runs on *nix is not a real viable solution in the long term. Mono is loosely glued together.

Thank you. I hope to hear from you shortly.

/r

 

Posted by on 2011/07/01 in for hire, ProgLang

 


Should Go replace my use of Python?

Here is an interesting post that posited the question in my title: Experience porting 4k lines of C code to go http://bit.ly/jm0Qws

There are a lot of reasons to use Go. I like that it’s from Google but I don’t like the release-often approach. I need something that is a little more stable than that. Granted, this offers some justification for using goinstall to deploy and update packages as new releases of Go are made available. There is also something to be said for the monolithic codebase, however, that flies in the face of this deploy approach.

But I like the compiled performance, channels and the wealth of packages. (It needs more, like a performant web framework, templates and production ready database adapters.)

Go, while cool, is still a little half baked, whereas python and perl are still up to the challenge.

 

Posted by on 2011/06/22 in ProgLang

 


seven in seven?

I learn a new language at least once a year. It’s just something that I have tried to do since I started taking my profession seriously (1988-ish). Recently I started to get the itch to learn a new language and it did not take long to select one.

I had been working with ZMQ (ZeroMQ) for a while and luckily for me they have example code in a number of client languages. Since ZMQ is implemented in C, they have plenty of C examples but curiously enough all of those examples have Lua versions too. The remainder of the examples vary from language to language.

I do not select languages because of the geek factor or the cool factor but for their ability to shorten the development cycle, the tools they provide, community support, the development pool, community activity and viability in business. Using these criteria I had initially dismissed Lua.

For example, there have not been any releases or patches in over 5 years even though Lua is the scripting language used in WOW(world of warcraft). According to github Lua is not in the top 86% of the languages stored there. The community seems to be very protective and a little snarky. And the origin of the language is based on some misguided protectionism on the part of the Brazilian government. And finally, performance.

So in the last 24 hours or so a couple of things have changed. First of all the Tiobe Community Index released some new numbers that suggest that Lua is making moves, although the sudden moves make me suspicious. Secondly, snarky people never really bother me. Next, WOW has a different sensibility when it comes to application correctness than banking applications.

Of course, nothing is going to change the origins of the language and I’m not sure how crazy I am about the fact that it was developed blindly but who cares. The language and its tools compile and install more simply and easily than erlang. And to respond to a tweet about concurrency primitives:

for n=1,NBR_WORKERS do
    local seed = os.time() + math.random()
    workers[n] = zmq.threads.runstring(nil, worker_task, self, seed)
    workers[n]:start(true)
end

I cannot say much about the benchmarks except that they would suggest that the language, in this testcase, is exceptional. And so even if the language does not support IPC for itself this mechanism might be even better for all the reasons one might use an MQ.

So for the time being I think that Lua is still on the short list and as soon as I break free I’m going to take a much closer look.

 

Posted by on 2011/06/08 in 7in7

 


 