Monitoring Lync 2013 with MRTG

Microsoft's Lync Server 2013 includes a number of reporting options, but no references on using Open Source technologies.


Saturday, 1 March 2014

(also known as Monitoring Lync with Open Source Tools)

Microsoft's Lync Server 2013 includes a number of reporting options, however they are server-focused and don't lend themselves well to external queries. Many network telecommunications providers and technology organisations use MRTG and RRD to monitor key aspects of their networks, devices and environments.

In fact, I use MRTG and RRD to monitor all aspects of our infrastructure, from Hypervisor metrics (CPU, RAM, HDD, etc.), individual guest virtual machine metrics, and network infrastructure devices such as our routers, firewalls, load balancers and various other appliances.

MRTG simply takes a number of values, typically 2, and stores the result in a local dataset.  Results are then graphed using MRTG's graphing tool, or using RRD for more precise output.

Data is normally provided to MRTG via SNMP counters, although it can take arbitrary values from any data source, with the right approach.

Lync Monitoring Reports

Lync Server 2013 includes a set of standard reports that are published by Microsoft SQL Server Reporting Services, within a hosted or on-premises environment. These reports, which are accessible by using a web browser, provide usage, call diagnostic information, and media quality information, on an overall or per-user basis.

Hosting and Service Providers

Hosted Lync environments are typically deployed in a multi-tenant state, meaning the service can be provided to a number of customers simultaneously from the one environment, segregating data and usage between customers.

The Lync Monitoring Reports, whilst having the ability to report per-user, don't have any visibility of tenancy groups or customers, which are typically based on individual SIP domains (such as customer1.com, customer2.co.uk, customer3.com.au, etc.)

It is quite possible that Microsoft will release this capability in some form in future Lync releases (2014? 2015?), however today this is not available. There are a few commercial solutions available, but these are typically expensive, proprietary and standalone solutions.

So let's roll our own!

Simplified Topology

For the purposes of simplification, we'll assume a relatively flat topology, with the Monitoring Server role separated from the Front End server.

Extracting Lync Statistics - Users Online

There are a number of ways to extract statistics, such as WMI counters and by querying the Central Management Store directly. The latter is the method I have chosen, as the database contains additional valuable information such as client version details.

Any of your Front End servers can be queried, as the databases are replicated between all Front End instances.

In a default Standard Edition installation, the Central Management Store is a SQL Express instance called [servername]\rtclocal, whilst the Enterprise Edition can utilize a full SQL instance for this store.

In any case, the table structures remain the same.  The first step is to return the recordset of current active Lync sessions. This is useful to test your connectivity, and forms the basis for further analysis.

Select cast(RE.ClientApp as varchar(100)) as ClientVersion,
    R.UserAtHost as UserName,
    Reg.Fqdn
From rtcdyn.dbo.RegistrarEndpoint RE
Inner Join rtc.dbo.Resource R
    on R.ResourceId = RE.OwnerId
Inner Join rtcdyn.dbo.Registrar Reg
    on Reg.RegistrarId = RE.PrimaryRegistrarClusterId
Order By ClientVersion, UserName

Lync defines an actual user session as an "Endpoint", hence the table names.  This query returns a recordset of the currently active sessions, and is updated based on Lync's session polling interval.

This is useful information, but let's refine the results so as to see the total number of users online, and also the total number of unique users.

Select count(*) as totalonline, count(distinct UserAtHost) as totalunique
From rtcdyn.dbo.RegistrarEndpoint RE
Inner Join
rtc.dbo.Resource R on R.ResourceId = RE.OwnerId
Inner Join
rtcdyn.dbo.Registrar Reg on Reg.RegistrarId =
RE.PrimaryRegistrarClusterId

Why are unique users important? This gives administrators visibility of how many users are using multiple devices.  Where there are more active sessions than unique users, this shows that one or more users are logged on simultaneously on different devices.

This example shows a difference of 1 between total online users and total unique users, so 1 user is logged on with two devices.
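The relationship between the two counts can be illustrated with a minimal Python sketch. It assumes you already have the UserAtHost value of every registered endpoint as a list of strings; the function name and the sample addresses are made up for illustration and are not part of Lync or of the queries above.

```python
from collections import Counter

def summarise_endpoints(user_names):
    """Return (total_online, total_unique, multi_device_users).

    user_names holds one UserAtHost value per registered endpoint,
    so a user signed in on two devices appears twice.
    """
    counts = Counter(user_names)
    total_online = len(user_names)   # same idea as count(*)
    total_unique = len(counts)       # same idea as count(distinct UserAtHost)
    multi_device = sorted(user for user, n in counts.items() if n > 1)
    return total_online, total_unique, multi_device

# Hypothetical data: alice is signed in on two devices
sample = ["alice@customer1.com", "bob@customer1.com", "alice@customer1.com"]
print(summarise_endpoints(sample))
```

The difference between the first two numbers is the count of "extra" devices, and the third element names the users responsible for it.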

By integrating into MRTG, results like this can be generated (see below for instructions on how to do this).

Extracting Lync Statistics - Media & Application Usage

Whilst the number of users online at any given time is very useful, sometimes you'd like to see _what_ your users are doing at any given time.

Activity usage is recorded in the Monitoring Server (typically a dedicated SQL server, or sometimes collocated on a Front End server in smaller installations).  Rather bafflingly, or perhaps as a reminder of the legacy of Lync, one of the key Monitoring databases is still referenced by its old Live Communications Server prefix: LcsCDR.

Within this database lies SessionDetails, a logging table of, you guessed it, sessions.

The MediaTypes column of this table is a bit set that indicates the media types used in the session. The defined values are:

Media Type           Bit Value
IM                   1
FILE_TRANSFER        2
REMOTE_ASSISTANCE    4
APP_SHARING          8
AUDIO                16
VIDEO                32
APP_INVITE           64
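Because the value is a bit set, a single session can carry several media types at once, and each type is tested with a bitwise AND. A small Python sketch of that decoding, using only the values listed above (the function name is illustrative):

```python
# Bit values exactly as listed in the table above
MEDIA_TYPES = {
    1: "IM",
    2: "FILE_TRANSFER",
    4: "REMOTE_ASSISTANCE",
    8: "APP_SHARING",
    16: "AUDIO",
    32: "VIDEO",
    64: "APP_INVITE",
}

def decode_media_types(value):
    """Return the names of every media type whose bit is set in value."""
    return [name for bit, name in MEDIA_TYPES.items() if value & bit]

print(decode_media_types(17))  # 16 + 1  -> ['IM', 'AUDIO']
print(decode_media_types(48))  # 32 + 16 -> ['AUDIO', 'VIDEO']
```

This is the same test that a SQL expression such as (MediaTypes & 16)=16 performs for audio sessions.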

As we're looking to graph various usage statistics, we only need to return values for sessions within the polling period of MRTG or RRD: 5 minutes, or 300 seconds.

The example below returns the number of sessions with the IM (Instant Message) bit set.

SELECT count(*)
FROM [LcsCDR].[dbo].[SessionDetails] s
where (MediaTypes & 1)=1

Of course, this will return the count for every session, so we need to narrow it down to the last 5 minutes.

SELECT count(*)
FROM [LcsCDR].[dbo].[SessionDetails] s
where (MediaTypes & 1)=1
AND s.SessionIdTime>=dateadd(minute,-5,getdate())

 

Important Hint: SessionIdTime is stored in UTC, while getdate() returns the server's local time. For those of us not living in the UTC timezone, you will need to transform s.SessionIdTime into your local timezone (or compare it against the current UTC time instead).  If you live in the USA, Australia or any large country with multiple timezones, you may also need to consider this.  Google the solution with Bing. :)
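One way to sidestep the conversion is to work out the start of the polling window in UTC and compare against that. The Python sketch below is only an illustration of the arithmetic: the function name and the UTC+11 sample clock are assumptions, and it does not connect to the database.

```python
from datetime import datetime, timedelta, timezone

def window_start_utc(now=None, minutes=5):
    """Start of the polling window as a naive UTC datetime.

    now may be any timezone-aware datetime; it defaults to the current time.
    The result can be passed as a query parameter to compare with a UTC column.
    """
    now = now or datetime.now(timezone.utc)
    start = now.astimezone(timezone.utc) - timedelta(minutes=minutes)
    return start.replace(tzinfo=None)

# Hypothetical local clock: 11:23 PM on 3 Feb 2014 at UTC+11
local_now = datetime(2014, 2, 3, 23, 23, tzinfo=timezone(timedelta(hours=11)))
print(window_start_utc(local_now))  # 2014-02-03 12:18:00
```

The point of the example is that "five minutes ago" at 11:23 PM local time is 12:18 PM in UTC, which is the value a UTC timestamp column has to be compared with.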

But what if you want to query this data for a specific userID or userIDs? Simply perform an outer join on the user table to see this information, then filter on the joined user columns in the where clause.

SELECT count(*)
FROM [LcsCDR].[dbo].[SessionDetails] s
left outer join [LcsCDR].[dbo].[Users] u1
on s.User1Id = u1.UserId
left outer join [LcsCDR].[dbo].[Users] u2
on s.User2Id = u2.UserId
where (MediaTypes & 1)=1
AND s.SessionIdTime>=dateadd(minute,-5,getdate())

This is a simplified approach, and there are a number of performance improvements you can make to the queries above.

Passing Lync Statistics to MRTG

Once you have the metrics you require, it's time to pass these to MRTG.  An almost identical process is used for RRD, so in the interest of brevity we'll focus on MRTG in this example.

MRTG uses a tool called rateup, which takes 4 inputs and stores them into a local, text-based flatfile database before generating the graphs based on historical data.

MRTG itself is a Perl script, to which we pass configuration data, including where to gather statistical data.

Important Hint: A *PERL* based tool used to monitor a Windows-based server environment? Sure! MRTG runs perfectly fine on any Windows platform running a Perl interpreter such as ActivePerl.  The examples below are based on MRTG installed on a Windows Server 2012 R2 machine with ActivePerl 5.16 Community Edition.

In my environment, due to my old-school approach, I use a small VBS script to query the LcsCDR database mentioned above, returning the values required by MRTG.

The values required are rather simple.

Value1
Value2
Uptime
Device Name

MRTG was originally designed to monitor network traffic, and Value1 and Value2 were originally in and out values measured in bytes.  These can, of course, be any values, representing any metric.

Uptime isn't overly relevant to monitoring Lync sessions, although if you'd like to see your server uptime status in the reports, by all means include it.

Similarly, Device Name is no longer relevant, as hopefully you're actually running Lync across multiple devices!

A typical example of my VBS script would return results similar to this (the last four lines are the values passed to MRTG):

C:\web\mrtg>cscript lyncactivity.vbs
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.
313
55
3/02/2014 11:23:38 PM
mylinkservice
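The four-line output is the whole contract between your script and MRTG, so any language that can print four lines will do. As a stand-in for the VBS script (which is not reproduced here), this Python sketch formats the same values shown above; the function name is illustrative, and the database query that would supply the two numbers is deliberately left out.

```python
def format_mrtg_output(value1, value2, uptime, name):
    """Build the four lines an external MRTG target script prints:
    first value, second value, uptime string, device name."""
    return "\n".join([str(int(value1)), str(int(value2)), str(uptime), str(name)])

# Values taken from the sample run above
print(format_mrtg_output(313, 55, "3/02/2014 11:23:38 PM", "mylinkservice"))
```

Casting the two values to integers keeps stray decimals or whitespace from a query result out of the output.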

 

You then point your MRTG instance at a configuration file. MRTG takes this information, parses it in the context of the settings specified there, and generates a graph.

 [pathtoperl] [path to mrtg perl script] [your lync configuration file]

Example:

C:\perl64\bin\perl c:\mrtg\bin\mrtg c:\mrtg\configs\lync.cfg

The lync.cfg file is a text file containing the configuration of each MRTG graph you want to create. A comprehensive listing of all the options available can be found on the MRTG web site.

Target[lync_sessions]: `cscript //nologo c:\web\mrtg\lyncactivity.vbs`
MaxBytes[lync_sessions]: 128
YLegend[lync_sessions]: Users
ShortLegend[lync_sessions]: Users
Legend1[lync_sessions]: Total Users
Legend2[lync_sessions]: Unique Users
LegendI[lync_sessions]: IM:
LegendO[lync_sessions]: File Sharing:
Options[lync_sessions]: growright,integer,noinfo,gauge,withzeroes
Title[lync_sessions]: IM and File Sharing Sessions

The key is the Target[name] line: this points to the source of your metrics, in this example my VBS file that returns the 4 data components shown earlier.

By parsing your configuration (.cfg) file every 300 seconds, MRTG will now poll Lync for your preferred metrics and generate graphs.

By default, MRTG returns 4 graphs (Last Day, Last Week, Last Month, Last Year) each building up over time as more and more data is stored.

The Results

MRTG can be used to show current and historical metrics, is extremely flexible, and is a great way of building your own monitoring and reporting capability, with just a few lines of script.

Final Steps - what to do with this data?

Depending on your own monitoring environments, the data generated by MRTG and/or RRD can be incorporated easily.  Graphs are individual PNG files, and raw data is in a simple flat-file format.
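To give an idea of how simple the raw data is, here is a minimal Python sketch that reads an MRTG .log file. It assumes the standard MRTG log layout: a first line holding the last update time and the two most recent values, followed by history lines of timestamp, two averages and two maxima. The sample text and the function name are made up for illustration.

```python
def parse_mrtg_log(text):
    """Split MRTG log text into the latest reading and the history rows."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    latest = {
        "time": int(header[0]),     # Unix timestamp of the last update
        "value1": int(header[1]),
        "value2": int(header[2]),
    }
    history = [
        {"time": int(r[0]), "avg1": int(r[1]), "avg2": int(r[2]),
         "max1": int(r[3]), "max2": int(r[4])}
        for r in rows if len(r) == 5   # skip anything that is not a history row
    ]
    return latest, history

# Hypothetical log contents, newest entry first
sample_log = """1393849418 313 55
1393849418 313 55 313 55
1393849118 310 54 312 55
"""
latest, history = parse_mrtg_log(sample_log)
print(latest["value1"], len(history))  # 313 2
```

From here the rows can be pushed into whatever dashboard or reporting tool you already run.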

Many Service Providers utilize the CACTI framework to group and display various device and service reports. This is a great way to start grouping your Lync usage reports for integration into your existing monitoring.

 

Coming Soon

My next post will be a how-to on integrating with RRDTool, as well as adding a few more interesting metrics.

 

Links

MRTG by Tobias Oetiker. MRTG is free software: download it, and if you use it, consider supporting Tobias!

RRD by Tobias Oetiker

CACTI: the complete rrdtool-based graphing solution.

Tags

IT, Lync, ServerManagement