HA server in “Split Brain” mode?

What is a High Availability Server Configuration?

A High Availability (HA) server configuration typically involves sharing a database between two servers. One server is Primary or Active, and the other server is Secondary or Inactive. Generally, in the case of Voice Mail servers or Contact Center servers, there is a shared trunk group that interconnects the iPBX or Call Manager to the Active Server. Should that server fail, the call searches through the trunk group, ultimately connecting with the Secondary Server, which has now taken on the Primary or Active Role. In the case of the Contact Center, both ShoreTel and CISCO use this strategy with great results.

Is it “alive”?

A condition known as “Split Brain” can result, however, causing both servers to get “confused”. This is generally the result of one of three conditions: a server loses power during a database replication between the servers, a server becomes disconnected from the network during replication, or one of the servers actually restarts during the replication process. When this happens, database updates to each server may not be replicated to the other server, and we get a Split Brain.

The first step in remediation is to recognize the condition! It may not be readily identifiable. In the case of CISCO, you can log into the server through the CLI and run utils service database status, which will show the present condition of both servers. If you see the status “connection status unknown” or “Primary/Unknown” or “Secondary/Unknown”, you are now in a schizophrenic mode! Both servers are up and operational, but neither server is processing calls, as neither server knows that it is the Primary server! Bad things are happening!

splitbrain

If this condition exists, action is required to restore sanity as quickly as possible!
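As a rough illustration of recognizing the condition, the check below scans captured status text for the three warning strings mentioned above. It is only a sketch: the function name is made up for this example, and the sample strings are hypothetical stand-ins, not real output from utils service database status.

```python
# Minimal sketch: flag a possible Split Brain from captured status text.
# The marker strings come from the article; the sample inputs below are
# hypothetical and do not reproduce real CLI output.

SPLIT_BRAIN_MARKERS = (
    "connection status unknown",
    "primary/unknown",
    "secondary/unknown",
)


def looks_like_split_brain(status_text: str) -> bool:
    """Return True if any of the warning strings appears in the text."""
    lowered = status_text.lower()
    return any(marker in lowered for marker in SPLIT_BRAIN_MARKERS)


if __name__ == "__main__":
    healthy_sample = "role: Primary/Secondary"    # hypothetical sample
    confused_sample = "role: Secondary/Unknown"   # hypothetical sample
    print(looks_like_split_brain(healthy_sample))
    print(looks_like_split_brain(confused_sample))
```

A check like this could be run against saved CLI output from both servers, so that an “Unknown” state is noticed before callers notice it for you.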

Split Brain surgery!

The remediation can be summarized as restoring each server to a role, Primary/Active and Secondary/Inactive. Once that is established, you will need to pick which database is most current and then copy it to the other server, restoring synchronization. Again, CISCO has tools to assist with this process and, if followed, this surgery can go smoothly and quickly! Basically, log into the CLI of the server that has the data you want to keep and run the command utils service database status. Then log into the other server and run the command utils service database drbd discard-node, which should restore database replication to normal. Then run the command utils service database status and you should see something that looks like this:

SplitBrain2

You might then need to double check the Distributed Replicated Block Device (DRBD) metadata for corruption if the servers do not sync up. DRBD is what synchronizes data between the two servers. Check the status as described above and, if you see the DRBD error, run the command utils service database drbd force-keep-node, which will reset the DRBD and set the server to Secondary/Inactive.

Ah, life in the world of HA!

Upgrading CISCO with Prime Collaboration Provisioning

Don’t Procrastinate!

Sooner or later you will find you need to upgrade your VoIP deployment, regardless of the vendor. Deferring upgrades just prolongs the inevitable and increases the complexity and the pain level. Let’s take the example of migrating a CISCO MCS based cluster that includes a CUCM at version 7.1.5, a UCCX at version 8.0.2, and a Unity Voice Mail at version 5.0, where some of the components have HA options! Having put off upgrades for some time, this client will have to migrate from MCS server hardware to VMware ESXi virtual machines and upgrade to the now current 10.X release across the cluster and applications. In the case of Unity, which has been replaced by Unity Connection, this is a completely new application addition. The complexity of this upgrade is about as challenging as they come! Additionally, the client expectation is that impact to the production environment will be non-existent!

Build out a “Mirror Lab” system

The decision was made to build out a completely separate “lab” system and to use the same IP topology as the production system. This in itself is an interesting set of configurations, as you will still need to maintain connectivity with all network services, particularly DNS and NTP. This might best be accomplished with a set of temporary service providers on a completely isolated network. In this instance we made use of the Prime Collaboration Deployment (PCD) tool to migrate and upgrade the CUCM cluster consisting of a Publisher and two Subscribers. As the “lab” network was to mirror the production network, we actually had three or four subnets to configure, as the HA servers were to be located at a different site than the Publishers. The PCD was relatively painless and very useful. We learned quite a lot about the capabilities of this tool and, in the end, consider it to be of great value; we will continue to use it in future migrations. See our previous blog about lessons learned!

Plan 3 hours per server per upgrade step!

Establish up front how much time this process will require. The actual time to do a backup and/or a restore of a specific server will be determined in large part by the size of your deployment. A backup for a single cluster with 100 users will take considerably less time than a backup of a multiple cluster deployment with 1,000 users! For planning purposes we used one hour per server for installation of software, including ISO, COP files and any required Engineering releases. Each backup added an hour, as did each restore. Keep in mind that you may be making multiple upgrades to achieve your end goal. Each upgrade will take place in the “inactive” partition. Then you will be required to switch partitions. This process will take as long as a backup! In this example we were moving from 7.1.5 to 10.6, and that would normally be a multiple step upgrade. In the case of the UCCX it most certainly was! So we have, in this example, 3 CUCM servers, 2 UCCX servers, and 2 Unity servers for a total of 7 servers. At the base level that is a minimum of 21 hours of server operations for each upgrade step! Plan accordingly and set expectations with all stakeholders! There will be long periods of watching computer screens and the progress bars that hopefully give you some feel for where you are in the process. CISCO upgrades list the number of tasks and also estimate the time per task. This is very helpful!

UCCXupgradeScreen
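The planning arithmetic above is easy to put in a few lines of code, which helps when the server count or the number of upgrade steps changes. This is only a back-of-the-envelope sketch using the one hour each for install, backup and restore from this example; the function name and the two-step figure at the end are assumptions for illustration, not numbers from the project.

```python
# Back-of-the-envelope planner using the article's rule of thumb:
# one hour each for install, backup and restore, per server, per step.

def hours_per_upgrade_step(server_count: int,
                           install_hours: float = 1.0,
                           backup_hours: float = 1.0,
                           restore_hours: float = 1.0) -> float:
    """Minimum hours of server operations for one upgrade step."""
    return server_count * (install_hours + backup_hours + restore_hours)


# Server counts from the example deployment.
servers = {"CUCM": 3, "UCCX": 2, "Unity": 2}
total_servers = sum(servers.values())

per_step = hours_per_upgrade_step(total_servers)
print(f"{total_servers} servers, {per_step:.0f} hours per upgrade step")

# Hypothetical: if the path needed two upgrade steps for every server.
assumed_steps = 2
print(f"{assumed_steps} steps, {per_step * assumed_steps:.0f} hours minimum")
```

Partition switches are not counted here, so treat the result as a floor rather than a schedule.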

Take time to learn COBRAS!

Once we had the CUCM cluster established, we turned our attention to the migration from Unity 5.0 to Unity Connection. This was achieved by building out a new ESXi based Unity Connection Version 10.5. CISCO has a great tool to assist with this migration in the form of COBRAS which, if you have not used it before, will take some study. Fortunately, there are many training videos on the CiscoUnityTools.com website, where you will go to download the required software. You will need the CISCO Unity Export tool and the Unity Connection Import tool. The Export tool needs to be installed on the Unity Server, as it will build connectors to the existing Unity configuration and User database. The tools are not difficult to learn, but do require some orientation. You can export the configuration including users, call handlers, mailboxes, prompts and even messages. If you set customer expectations that they will not have historical messages, you can eliminate importing messages, which will simplify the process.

The Unity Connection Import tool can be installed on a Windows laptop. You will need to download and install IBM Informix drivers to connect to the Unity Connection Server using a Microsoft ODBC connector. In this example, we moved from a system that had far fewer features to a system that had many more features. Our expectation was that this would be the most painful part of the migration journey, but it turned out to be comparatively easy. The Unity Connection Server came up with most of the old call handlers matching definitions for new call handlers, and the user database imported without error. We chose not to import old messages and set an expectation with the client that there was a point in time by which they would need to clear out old messages, as they would not be on the new system.

List and have available all COP files!

Now the upgrade and migration of the UCCX was in fact the most challenging part of the journey.   With the Call Manager now on Version 10.5 the current UCCX 8.0.2 system could not communicate with the call manager.  At first this was not thought to be an issue.  We backed up the UCCX 8.0.2  server and built out the same version machine on ESXi.   Then we did a restore and now had a virtualized UCCX 8.0.2.   You will not be able to log in to the UCCX administration page, however, as the user database is on the CUCM, and the two systems are currently incompatible.  There is a very long list of steps to get to UCCX 10.6 and each step required a backup and restore!  The clock is ticking!

We upgraded the 8.0.2 software to 9.0.2 and found that we needed a license key to be able to log into the Administration portal. Given that we were only temporarily stopping here, and ultimately would license under Prime License Manager, we did not plan for this. However, we wanted to make sure that all data was successfully migrated to the new version. So we obtained a demo license through TAC to take a look at our repositories and verify that all scripts, prompts, documents, etc. were successfully migrated. We noted that the Call Control Groups were not in place, but determined that was a result of a JTAPI version incompatibility. We next needed to apply COP files and migrate to 9.0.2SUS2 in preparation for the journey onward. At this point, we found an error in the database replication of the UCCX servers and elected to remove the secondary server, which we would add back in at a later step. This required TAC assistance to log in as root and run a script to strip the database of any reference to the second UCCX server.

Hardware requirements change as you move through to 10.6, so be aware of them. As you might have done on MCS upgrades to 9.0.2, you will add RAM and maybe disk drives, depending on the size of your UCCX. So, moving from the ESXi virtual 8.0.2 clone, we attempted to build out the ESXi machines using Version 10.6 OVA files, but ended up having to download and use an OVA for version 9.0.2. The upgrade to 10.6 required yet another COP file before the upgrade could be started. Again, it is important to study the various upgrade paths, as you may be moving through several upgrades, patches and COP files, so keep track and write it down! After the COP file, yet another backup! Finally, we were able to move to 10.6! Once on 10.6, we needed to add the HA server back into the mix. Actually, in the future, for any multiple step upgrade it may be best to remove the HA server before starting the upgrade. Generally this is not a CISCO supported method, but you can see how much time it cuts out of the project, as you do not have to upgrade, back up and restore two servers at each step of the process!

Now that we have a complete system upgraded and virtualized, we can do some testing, specifically working through firmware and CTL issues, if any! Though we did not have any gateways to connect with the outside world, we were able to bring up phones, assign users and make phone calls to the UCCX and the Unity Connection. What remains is scheduling the maintenance window to facilitate the “go live”. The plan was to take the old system offline and put the new system online. Keep in mind we used the same machine names and IP topology in the “lab” as we did in the production environment!

The Browser Wars taking control of “hosted” applications!

As more and more applications become cloud based hosted solutions, the more urgent your choice of Internet browser will become! The desktop warfare, in our opinion, is really getting out of control! You would think that you can use any browser for any website you want to surf, but such is not the case. We recently made the mistake of trying to pay our Microsoft Office Online invoice while using a Firefox browser. Only when we switched over to a Microsoft IE browser could we complete the transaction!

In yet another painful situation, while working with the CISCO Prime Collaboration Deployment tool, we were experiencing a database access error. This error stalled development for several hours as we tried to find the root cause of the connectivity error. Only by accidentally switching to IE from Firefox did we uncover that the connectivity error was a fraud perpetrated by our choice of browser. (Granted, we think the only useful thing you can do with IE is download Firefox.)

WTF is Control C?

Those of you who have been working with fat client based solutions like Microsoft Outlook might actually be somewhat discouraged from using Microsoft Office 365. All of the usual operations, like right clicking on an object to copy and paste, are suddenly replaced with pre-historic multi key chord strokes like Control C or Control X. Personally, I find both Google and Microsoft cloud solutions to be frustrating! The simple act of highlighting a range of text might cause you to go completely over the edge! The application will “bark” at you with an error recommendation as you try to replicate the desktop experience with your browser based application.

BrowserCutPaste

We have many clients who require us to use a variety of different tools when we work on those projects.  For example, some folks like Google mail rather than Outlook OWA as implemented in Microsoft Office 365.   Web conferencing tools like Webex from CISCO and Lync or Skype for Business by Microsoft have very different results depending on your choice of browsers.

Desktop Warfare winners?

Who is winning the war? Well, it seems that Google Chrome is the hands down winner with a 44% market share, largely resulting from losses by Firefox and Microsoft IE. Apple’s Safari seems to remain relatively constant, and newcomer Opera does not seem to be making an impact. These numbers hold true regardless of the platform, with Apple and Android showing the same browser preferences as their desktop competitors.

BrowserCounter

So how do you survive with this level of warfare? We find that both Chrome and Firefox have outside developers who support the browser by creating add-ons. Our particular favorite is Firefox, as they seem to have a wider community of developers. (We also figure anything you type into a Chrome browser is immediately searchable by the entire Internet population.) Though feature comparisons are useless, and most folks just pick a browser based on personal preferences or device (Safari is an Apple product), we think Chrome and Firefox are the most open solutions. We are Mac freaks, but at the end of the day, WebRTC will most likely be developed by Chrome and Firefox, who are not defending hardware market share or erecting proprietary applications. The kool kids typically pick one of these two browsers if they are going to change from the browser that shipped on their Windows or Apple platform.

As you use more and more cloud based applications you will become increasingly more aware of the browser warfare taking place on your desktop!  At the end of the day, you are going to end up using more than one browser to get your work done!

Hey watch how easy it is to get the password you left in your browser!

Is your Contact Center “Smartphone” enabled?

When it comes to customer engagement options, digital contact will surpass voice contact in contact centers in less than two years. This prediction by Dimension Data also highlights that not only are most Contact Centers unable to deal with Smartphone input like SMS or “text” messages, but more than 80% do not have the IT infrastructure in place to manage the migration to digital customer contact.

Text my next appointment please!

It is simple enough to test the validity of this claim. Does your contact center currently support SMS or “text” message routing to the next available customer service agent? Can a customer of your company send a text message that says, “Call me at this number please”? Intuitively you know that a text message costs less than a phone call. We know that the infrastructure to support text messages costs significantly less than the infrastructure to support large numbers of incoming telephone lines. So if you do not support SMS today, is it on your Contact Center road map?

“We are rushing the kids to school not waiting for the next available agent!”

Website “Chat” seems to be a feature adopted by Contact Centers, but is that the correct digital engagement strategy? It is the correct strategy for visitors to your website, but not for those dropping off the kids on the way to work! Chat is great for website visitors; SMS is the best solution for our highly mobile society! If you are using Email in your Contact Center, then you are a small step away from allowing SMS and text messages to reach your agents! The technology is simple to implement and readily available. Agents can hit reply and quickly send a text message right back to the original cell phone, and all text messages are archived and reportable for compliance purposes.
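To show how small that step from Email to SMS can be, here is a sketch that wraps an inbound text message in an ordinary email addressed to an agent queue, using only the Python standard library. Everything specific in it is an assumption made for illustration: the gateway domain sms.example.com, the queue address, and the function name are hypothetical, and a real deployment would receive the text from a carrier or SMS gateway service.

```python
# Minimal sketch of "text to email": wrap an inbound SMS in an email
# so it can be queued to agents like any other message.
# The domain, queue address and phone number are hypothetical.
from email.message import EmailMessage


def sms_to_email(sender_number: str, text: str, queue_address: str) -> EmailMessage:
    """Build an email whose reply address maps back to the cell phone."""
    msg = EmailMessage()
    # Encoding the number in the address lets an agent simply hit reply.
    msg["From"] = f"{sender_number}@sms.example.com"
    msg["To"] = queue_address
    msg["Subject"] = f"SMS from {sender_number}"
    msg.set_content(text)
    return msg


if __name__ == "__main__":
    email = sms_to_email("+17025550100",
                         "Call me at this number please",
                         "agents@example.com")
    print(email["Subject"])
    print(email.get_content().strip())
```

Because the result is a normal email, the archiving and reporting already in place for Email contacts would apply to these messages as well.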

Being able to send a “chat” to someone who is on your website is a step in the right direction, but being able to open a real time voice conversation with that website surfer has much higher customer service impact!   So, what role does WebRTC play in your Call Center?  Customers are used to hearing Music on Hold,  but with WebRTC the option for Video on Hold  is a reality.   Think about it!  Is it on your Contact Center development road map?

‘TEXT2ECC’ to 702-956-8700

The issue is not Cloud versus Customer Premise. The issue is “voice” versus “digital” engagement strategies in customer service focused call centers. Send the keyword ‘TEXT2ECC’ to 702-956-8700 along with your email address and we can set you up with a T2E (text 2 email) account free of charge. The video clip shows you how to access our WebRTC demo! Give it a try and let us know what you think!

Use an Ingate SIParator and you are “virtually there”!

We have written on the subject of SBC quite extensively in the past and have also covered the easy installation of the Ingate product (see DrVoIP here).   Readers must find this interesting because the hit counter for our Ingate videos continues to grow, indicating engineers are eager to learn more about this product.   We generally regard ourselves as CISCO brats, but when it comes to Session Border Controllers, we remain deeply impressed with both the Ingate product and, most importantly, the Ingate support team!  Pre-sales support is typically as good as it gets when developing a relationship with a vendor.  Post sales support, however, is where the true value system of a company is tested and Ingate passes with high marks.

Ingate SIParator as a virtualized appliance

Ingate began shipping product as early as 2001 and has its roots in firewall security products. Ingate has now made its very popular SIParator Session Border Controller available as a virtual software appliance. The SIParator E-SBC, scalable from 5 to 20K sessions, can be obtained as either a hardware appliance or as a software package. There are over 10K SIParators installed and working worldwide, making Ingate the “go to” knowledge base for documented SIP deployment experience, without equal on a global basis! Those of you working with ShoreTel have already discovered how powerful a VMware ESXi deployment can be. New options for fail safe, high availability and increased reliability magically appear when you virtualize your deployment! Ingate is no different, and the availability of the Ingate SIParator as a virtualized appliance adds a significant level of both reliability and flexibility to your ShoreTel deployment.

The most widely asked question in the DrVoIP technical support forum: “Is there a need for a Session Border Controller?” “Why can’t we just use our firewall?” is a common theme. Though it is possible to use a firewall to do a SIP trunk implementation, it is not our best practice recommendation to use a firewall in that way. Even firewalls with SIP ALG functionality fall short of the wide range of features needed for true SIP arbitration. We are firm believers that firewalls already have enough work to do and are being attacked even more ferociously every day by a wider group of hackers and evil doers than ever before. If you are committed to using a “firewall” to do SIP deployments, then we urge you to consider at least using an Ingate SIParator Firewall as a best of breed solution!

A dedicated Session Border Controller

Session Border Controllers have a lot of work to do! The concept of normalization alone could fill a text book. The fact is, not all SIP implementations are equal. It is often necessary to swap SIP message headers to achieve the desired results! Try getting your firewall, unless it is a SIParator, to do a SIP message header translation and you will quickly understand why a dedicated Session Border Controller makes sense!
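To make “swapping SIP message headers” a little more concrete, here is a toy sketch of the idea: take a raw SIP request and replace the value of one header before passing it along. This is not how the SIParator is configured or implemented; the function name, hosts, and numbers are all made up for the example, and a real SBC handles far more (compact header forms, multiple header instances, dialog state, and so on).

```python
# Toy illustration of SIP header normalization: rewrite one header value
# in a raw SIP message. All names, hosts and numbers are hypothetical.

def rewrite_header(sip_message: str, header: str, new_value: str) -> str:
    """Return the message with the named header's value replaced."""
    # Headers end at the first blank line; the body (e.g. SDP) is untouched.
    head, separator, body = sip_message.partition("\r\n\r\n")
    rewritten = []
    for line in head.split("\r\n"):
        name, colon, _old_value = line.partition(":")
        if colon and name.strip().lower() == header.lower():
            rewritten.append(f"{name}: {new_value}")
        else:
            rewritten.append(line)
    return "\r\n".join(rewritten) + separator + body


if __name__ == "__main__":
    invite = (
        "INVITE sip:100@pbx.example.com SIP/2.0\r\n"
        "From: <sip:anonymous@10.0.0.5>\r\n"
        "To: <sip:100@pbx.example.com>\r\n"
        "\r\n"
    )
    # Hypothetical rule: present the caller as a routable carrier identity.
    print(rewrite_header(invite, "From",
                         "<sip:7025550100@carrier.example.com>"))
```

Even this trivial rule requires parsing the message rather than just forwarding packets, which is the kind of work a general purpose firewall is not built to do.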

IngateFeatures

The software SIParator is easy to obtain, easy to install, easy to configure, and easy  to license.  Ingate has adopted a pay as you go philosophy, and though the software product scales from 5-2000 channels, you only pay for what you use!  In fact, Ingate is so confident in the adoption rate of its product over competitors,  they offer a 30 day free trial.  Just click here to take advantage of this outstanding offer.

The video is Part one of a two part video on the product!   Part one shows how to obtain, download, and install the virtual SIParator software package.  Part two goes through the configuration of the SIParator on a ShoreTel system for use in SIP trunking deployments.  This material was previously covered in our YouTube video on Ingate and that material is still relevant!

Kudos for Ingate

Lastly, we want to commend Ingate not just for having a great product, but for the quality of the support they offer the entire industry through an ongoing commitment to educating the marketplace on SIP and now WebRTC technology. We are not talking about thinly masqueraded advertising, but serious SIP education programs for serious technology students, and a demonstrated sincere desire to advance the state of the art! They offer an endless variety of webinars, seminars and ebooks, and even work in partnership with the SIP school to further develop and educate industry stakeholders. Excellent work, Ingate, and well done! – DrVoIP

[youtube]d89gEZhRMT8[/youtube]