Lync 2010 Central Site Resilience w/ Backup Registrars, Failovers, and Failbacks – Part 3

Introduction

Welcome to Part 3 of this article series. In Part 1, we started off by discussing the goal of this lab: to pull together all the information out there on how to use Central Site Resilience, including failovers, failbacks, how redirects function, and how SRV records fit in. We first discussed the lab setup using Hyper-V, and then took a look at the base topology and configuration. In Part 2 of this article series, I went through the sign-in process for each user. Because the SRV record for sign-in was pointed to A-L14FE1.shudlab.net, the sign-in process was different for each user logging in.

In this Part, we’ll do a failover test and a failback test with only a single SRV record in place. We’ll take a look at what happens to ClientUser1 using the A-L14FE1 Pool and what happens to ClientUser2 using the A-L14FE2 Pool when we take down one of the Pools. We will then take a look at what happens when the Pool that went down comes back online. And finally, we will end our tests by seeing what happens when a second SRV record is in place.

Part 1

Part 2

Part 3

The Failover with only one SRV

As shown in Part 1, our SRV record is pointing to A-L14FE1.shudlab.net. If you recall from Part 2, when ClientUser1 signed in and connected to A-L14FE1.shudlab.net, he received no 301 redirect and therefore was not informed of his Primary and Backup Registrar. We also saw ClientUser2 connect to A-L14FE1.shudlab.net and receive a 301 redirect with his Primary and Backup Registrar. ClientUser2 then connected and registered to A-L14FE2.shudlab.net.

ClientUser1

Let’s start with disabling the NIC on A-L14FE1. I wanted to see the behavior of ClientUser1. Don’t forget, ClientUser1 initially connected to A-L14FE1.shudlab.net with no 301 redirection. Because of this, it has no idea what its Backup Registrar is, and there are no additional SRV records other than the one that has you connecting to A-L14FE1.

After approximately 30 seconds, ClientUser1 gets disconnected.

Remember in the Topology, we had a failover detection time of 30 seconds.  I let this sit here for about 5-8 minutes and it stayed disconnected.

ClientUser2

The first thing I did was re-enable the NIC on A-L14FE1 and let ClientUser1 sign back in. I want to be in a normal operational state. Now let’s disable the NIC on A-L14FE2.shudlab.net and see what happens with ClientUser2. What we should see happen is that it fails over to A-L14FE1.shudlab.net. The reason is that it signed in using the SRV record, received the 301 redirect, and was informed of both its Primary Registrar and its Backup Registrar. While ClientUser2 should be able to fail over, don’t forget about the Endpointconfiguration.cache file. If this client were to sign out and sign back in, it would not use the SRV record and would connect directly to A-L14FE2.shudlab.net. Because of that, it would no longer know about its Backup Registrar and would have no idea where to reconnect.

But let’s take a look at both scenarios. Let’s first see whether it fails over properly, since it received a 301 redirect during its last sign-in.

We’ll go ahead and disable the NIC on A-L14FE2.shudlab.net.

After around 30 or so seconds, ClientUser2 signs out.  What I would expect now is ClientUser2 connects to A-L14FE1.shudlab.net since again, when ClientUser2 initially signed in, it received a 301 redirect which informed ClientUser2 of both the Primary and Backup Registrar.

And just as I thought, ClientUser2 connects to A-L14FE1.shudlab.net

After re-enabling the NIC on A-L14FE2.shudlab.net, within about 40 seconds (the failback interval we configured), ClientUser2 reconnects to A-L14FE2.shudlab.net.

The Failover with a second SRV pointing to the secondary pool

So we’ve seen ClientUser1 fail to connect when A-L14FE1.shudlab.net goes down because ClientUser1 never received a 301 redirect message and because there is no 2nd SRV record in the environment.  Let’s go ahead and add our second SRV record with a priority of 10.

And just to verify A-Client1 sees the change, let’s do a new nslookup.

Ok, now let’s run the same test we initially did.  I’m shutting down A-L14FE1.shudlab.net server’s NIC.  What we saw earlier on in our tests is that ClientUser1 would just sit signed out with nowhere to go.  What should happen now is the Lync client signs out, ends up finding the second SRV record, and now is able to connect to the second pool, A-L14FE2.shudlab.net.

After around 30 or so seconds, ClientUser1 signs out.  Let’s see if it picks up the 2nd SRV record and then signs into A-L14FE2.shudlab.net

After a little bit of waiting, sure enough, ClientUser1 can now successfully sign into A-L14FE2.shudlab.net

Now let’s take a look at a Netmon Trace and see what exactly ClientUser1 did for DNS lookups.

When the server is down, we see the client query for _sipinternaltls._tcp.shudlab.net. We can see in the red highlights at the bottom that a-l14fe1.shudlab.net and a-l14fe2.shudlab.net are both returned. Part of the data returned is the priority information. What we end up seeing below is that ClientUser1 tries to connect to a-l14fe2.shudlab.net because it knows it is having problems connecting to a-l14fe1.shudlab.net. Because that 2nd SRV record is in place, ClientUser1 found it, did another query for a-l14fe2.shudlab.net to find its IP address, and then made a connection to this server. Voila, we now have a failed over client.
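The lookup behavior seen in the trace can be sketched roughly as follows. This is a minimal Python model, not the Lync client's actual logic: the `SrvRecord` type and `pick_target` function are hypothetical names, the priority of 0 on the first record is an assumption (only the second record's priority of 10 is stated above), and SRV weights are ignored for simplicity.

```python
from typing import Iterable, NamedTuple, Optional, Set


class SrvRecord(NamedTuple):
    """One answer from the _sipinternaltls._tcp.shudlab.net query (simplified)."""
    priority: int
    target: str


def pick_target(records: Iterable[SrvRecord], reachable: Set[str]) -> Optional[str]:
    """Return the reachable target with the lowest priority value, or None."""
    for record in sorted(records, key=lambda r: r.priority):
        if record.target in reachable:
            return record.target
    return None


records = [
    SrvRecord(priority=0, target="a-l14fe1.shudlab.net"),   # assumed priority
    SrvRecord(priority=10, target="a-l14fe2.shudlab.net"),  # second SRV added in this part
]

# With the NIC on A-L14FE1 disabled, only the second pool is reachable.
print(pick_target(records, reachable={"a-l14fe2.shudlab.net"}))
```

With both pools reachable, the same function returns a-l14fe1.shudlab.net, which matches the idea that the lower priority value is tried first.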

 

Reviewing some key points

  • If a client gets redirected to a server, it is a 301 redirect that informs the client of its Primary and Backup Registrar. If the Primary happens to be down (for example, if you connected to a Director), the client will automatically be able to connect to its Backup Registrar. If the Primary happens to be operational, the user connects, and the Primary then goes down, that user will fail over to the Backup Registrar.
  • If a client has signed in at least once, their Primary Server has been cached into a file called Endpointconfiguration.cache.  That client will always connect directly to that server instead of potentially getting a 301 redirect.  It is because of this it is very important to have multiple SRV records in the environment to increase the chance that regardless if a server is cached in the Endpointconfiguration.cache file, that client will have another means to find another registrar in the environment.  If that registrar happens to be another pool that is not their primary, the user will get a 301 redirect to their Primary and Backup Registrar Pool.
  • A Director does help, as it redirects clients to their correct pool with a 301 redirect, thus letting each client know what its Primary and Backup Registrar are. But as you have seen, do not completely rely on this, because the client caches server information in the Endpointconfiguration.cache file. You absolutely should have at least 2 SRV records with two different priorities to ensure a client will fail over to another registrar regardless of whether you have a Director in your environment or not.
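To tie the key points together, here is a small Python sketch of the reconnect decision as observed in this lab. It is a simplified model under stated assumptions, not the real client algorithm: `User` and `reconnect` are hypothetical names, `got_redirect` stands for "the client received a 301 redirect on its last sign-in", and `srv_targets` is assumed to be already ordered by SRV priority.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Set

FE1 = "A-L14FE1.shudlab.net"
FE2 = "A-L14FE2.shudlab.net"


@dataclass(frozen=True)
class User:
    name: str
    primary: str
    backup: str


def reconnect(user: User, got_redirect: bool,
              srv_targets: Sequence[str], up: Set[str]) -> Optional[str]:
    """Where can a client register after losing its Primary Registrar?"""
    # A 301 redirect told the client about its Backup Registrar.
    if got_redirect and user.backup in up:
        return user.backup
    # Otherwise the only way to find another registrar is an SRV record.
    for target in srv_targets:
        if target not in up:
            continue
        # Any registrar that answers can point the client at its
        # Primary or Backup Registrar, whichever is available.
        for candidate in (user.primary, user.backup):
            if candidate in up:
                return candidate
    return None


user1 = User("ClientUser1", primary=FE1, backup=FE2)
user2 = User("ClientUser2", primary=FE2, backup=FE1)

# One SRV record, A-L14FE1 down: ClientUser1 has nowhere to go.
print(reconnect(user1, got_redirect=False, srv_targets=[FE1], up={FE2}))
# One SRV record, A-L14FE2 down: ClientUser2 knows its backup from the 301.
print(reconnect(user2, got_redirect=True, srv_targets=[FE1], up={FE1}))
# Two SRV records, A-L14FE1 down: ClientUser1 finds the second pool.
print(reconnect(user1, got_redirect=False, srv_targets=[FE1, FE2], up={FE2}))
```

The three calls mirror the three tests in this part: a stranded client, a failover driven by the 301 redirect, and a failover driven by the second SRV record.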

 

Conclusion

Well folks, that is all for not just Part 3, but the entire article series. In this part, we performed a failover test and a failback test with only a single SRV record in place. We took a look at what happens to ClientUser1 using the A-L14FE1 Pool and what happens to ClientUser2 using the A-L14FE2 Pool when we take down one of the Pools. We then took a look at what happens when the Pool that went down comes back online. And we finally ended our tests by seeing what happens when a second SRV record is in place.

Hopefully these articles have helped you understand more on how the deployment of Lync 2010 Central Site Resilience works.  Feel free to ask questions in the comments below and I will do my best to answer questions.

 

Lync 2010 Central Site Resilience w/ Backup Registrars, Failovers, and Failbacks – Part 2

Introduction

Welcome to Part 2 of this article series. In Part 1, we started off by discussing the goal of this lab: to pull together all the information out there on how to use Central Site Resilience, including failovers, failbacks, how redirects function, and how SRV records fit in. We first discussed the lab setup using Hyper-V, and then took a look at the base topology and configuration.

In this Part, I will go through the sign-in process for each user.  Because the SRV record for sign-in is pointed to A-L14FE1.shudlab.net, the sign-in process will be different for each user logging in.

Part 1

Part 2

Part 3

The Sign-In Process

As shown in Part 1, our SRV record is pointing to A-L14FE1.shudlab.net.  This means when ClientUser1 connects, he will connect directly to his home server.  When ClientUser2 connects, he will connect to A-L14FE1.shudlab.net, will get authenticated, and will receive a 301 redirect to A-L14FE2.shudlab.net.

ClientUser1

This is a completely fresh client.  No Lync 2010 client has signed in and therefore, there is no cached folder with any endpointconfiguration.cache file with a cached server.  The Lync 2010 client will sign in for its first time and do the SRV lookup.

Let’s enable logging on A-L14FE1 since that is the server that will be authenticating all Lync 2010 logins. Essentially, if we had a Director, it would be doing the exact same thing in this situation. We’ll start logging by going to: Start > All Programs > Microsoft Lync Server 2010 > Lync Server Logging Tool. Enable the SIPStack option, choose Information, and then choose All Flags. Then click Start.

Now that we’re logging, we’ll hop back onto A-Client1, and sign into the Lync 2010 client using the ClientUser1 user account.  We can see that the Lync 2010 Client on A-Client1 signed in successfully and we can see in the Configuration Information (Control + Right-Click on Lync Icon in Notification Area) that we’re connected to A-L14FE1.

Heading back over to A-L14FE1.shudlab.net, let’s take a look at the Lync Logs. We click Stop and then Analyze to view the logs in Snooper. Make sure the Lync 2010 Resource Kit tools are installed, otherwise Snooper will not launch. Taking a look at the log, we can see a bunch of incoming Subscribes and a bunch of incoming Service messages. This Pool has authenticated this user and is now servicing this user. We see no SIP redirects and therefore, ClientUser1 has no idea what its Backup Registrar is.

Taking a look at the endpointconfiguration.cache file, we can see that this client now has A-L14FE1.shudlab.net cached.  It will no longer try to do an SRV lookup unless it cannot connect to the server specified in this endpointconfiguration.cache file.

ClientUser2

Just like ClientUser1, this is a completely fresh client.  No Lync 2010 client has signed in and therefore, there is no cached folder with any endpointconfiguration.cache file with a cached server.  The Lync 2010 client will sign in for its first time and do the SRV lookup.

Let’s go ahead and start logging again on A-L14FE1.  Refer to the ClientUser1 section on  how to log.  We’re logging on A-L14FE1 instead of A-L14FE2 to see the difference in how A-L14FE1 responds when a user is logging in from a different pool.

Now that we’re logging, we’ll hop back onto A-Client2, and sign into the Lync 2010 client using the ClientUser2 user account.  We can see that the Lync 2010 Client on A-Client2 signed in successfully and we can see in the Configuration Information (Control + Right-Click on Lync Icon in Notification Area) that we’re connected to A-L14FE2.

Heading back over to A-L14FE1.shudlab.net, let’s take a look at the Lync Logs.  We click Stop and then Analyze to view the logs in Snooper.  Make sure the Lync 2010 Resource Kit tools are installed otherwise Snooper will not launch.  Taking a look at the log, we see a ton less data than we did when ClientUser1 logged in.  Again, this is because ClientUser1 was homed on A-L14FE1 whereas ClientUser2 is not homed on A-L14FE1 but is rather homed on A-L14FE2.

Because ClientUser2 is homed on A-L14FE2, when ClientUser2 initially connected to A-L14FE1, we can see the authentication occurring, and then A-L14FE1 issues a 301 redirect to ClientUser2. In the data on the right of the log, we see that in this 301 redirect message, the user is notified what their Primary Registrar is (A-L14FE2.shudlab.net:5061) and what their Backup Registrar is (A-L14FE1.shudlab.net:5061). This is why Doug, in his blog article, talked about the benefits of Directors. A Director will issue a 301 redirect for all authenticating users. This way, clients will know about their Primary and Backup Registrar.

But Chris’ article talks about the endpointconfiguration.cache file.  When this ClientUser2 successfully connected, he put ONLY his Primary Registrar into this file.  Because of this, on subsequent attempts, ClientUser2 will connect directly to A-L14FE2.shudlab.net instead of doing an SRV lookup, connecting to A-L14FE1.shudlab.net, and then getting a 301 redirect.  It’s because of this Chris’ article mentions that you should still have multiple SRV records.  They are needed to handle this situation.

Taking a look at the endpointconfiguration.cache file on ClientUser2, we can see that the Backup Registrar is not cached.

Conclusion

Thanks for reading Part 2.  In this Part, I went through the sign-in process for each user.  Because the SRV record for sign-in was pointed to A-L14FE1.shudlab.net, the sign-in process was different for each user logging in.

In Part 3, we’ll do a failover test and a failback test with only a single SRV record in place. We’ll take a look at what happens to ClientUser1 using the A-L14FE1 Pool and what happens to ClientUser2 using the A-L14FE2 Pool when we take down one of the Pools. Finally, we’ll take a look at what happens when the Pool that went down comes back online.

To read Part 3 of this article series, click here.

 

Lync 2010 Central Site Resilience w/ Backup Registrars, Failovers, and Failbacks – Part 1

Introduction

I’ve seen quite a bit of discussion on Lync 2010 Backup Registrars. There have been some PowerPoints which show how a Backup Registrar works, there have been some blogs that discuss how clients are handed back a primary registrar and backup registrar, etc… What I have not seen is one article that wraps everything together and shows it all in action. That is what this multi-part article is going to do.

In this first part, we’ll go over what the lab setup is going to look like and take a look at the base topology and configuration before we really start diving into the actual scenario testing.

Part 1

Part 2

Part 3

General Information

Now the two best blog articles (both from Microsoft employees) I’ve found in regards to Backup Registrars (other than the official Planning for Central Site Resilience which you can find here) are:

To summarize the articles above and set the stage for my article, you will see below that I am not including a Director. The key point Doug makes in his article is that when a Lync 2010 Registrar (in Doug’s article it is a Director) redirects a user to their home pool, that 301 redirect contains the user’s Primary Registrar and their Secondary (Backup) Registrar. Now any Registrar in the deployment can accept a user login, and if that registrar is not the home pool for the given user, it issues a 301 redirect for that user. A Director always issues 301 redirects, as one of its purposes is to handle client logins. In my article, I have two pools and one pool will always be authenticating all users. So in my case, I can simulate the 301 redirect piece by having Client2, which is homed on Pool2, sign into Pool1. That client will then get the 301 redirect. We’ll take a look at the SIP traces later.

Chris Norman then makes a point that there is a file called Endpointconfiguration.cache which caches your last connected server.  In my example above, once Client2 gets a 301 redirect to his pool, he knows about his Primary and Backup Registrar.  On subsequent connects, he will use the information in his Endpointconfiguration.cache and will no longer get a 301 redirect and he won’t know about his Backup Registrar anymore.  Because of this, we need another mechanism for clients to be able to find another registrar in the environment.  And Chris Norman mentioned having additional SRV records with a different priority.  This way, if Client2 looked in his Endpointconfiguration.cache file, saw that he should connect to Pool2, and Pool2 happened to be down or it went down while he was connected, the client would find the other SRV record, know how to connect to Pool1, and voila, he is connected.

That is a very quick summary.  I would recommend giving the above two blog articles a read.  Or just read on as I will explain everything and show multiple scenarios.  Throughout the rest of the article series, I will be showing the following scenarios:

  • Showing failover without additional SRVs.  What happens to ClientUser1 on A-L14FE1 (Pool1) when we take down that Pool?  What happens to ClientUser2 on A-L14FE2 (Pool2) when its Pool goes offline?
  • How does failback work?
  • What changes after we’ve cached our Primary Registrar for both users in the Endpointconfiguration.cache file?
  • What changes if we add a secondary SRV record?

Lab Setup

Guest Virtual Machines

One Server 2008 R2 Enterprise (Standard can be used) SP1 x64 Domain Controller with Certificate Services installed as the Enterprise Root Certificate Authority.

Two Server 2008 R2 Standard (Enterprise can be used) x64 (x64 required) Member Servers where Lync Server 2010 is installed. Both of these Lync Server 2010 Servers are installed as two separate Lync Standard Edition Pools.

Two Windows 7 X64 Enterprise (Professional and Ultimate can be used instead) Client Machines where the Lync 2010 Client will be installed.  One client machine is using a user account to connect to one Standard Edition Pool.  The other client machine is using a separate user account to connect to the other Standard Edition Pool.

Assumptions

  • You have a domain that contains at least one Server 2003 Domain Controller (DC/GC).
  • The client machines have network connectivity and can talk to either Lync 2010 Front End Server.
  • You are using the latest updates on this Lync 2010 infrastructure.  At the time of writing this lab, we are utilizing Cumulative Update (CU) 5.

Computer Names

Lync 2010 Standard Edition Front End Server – A-L14FE1

Lync 2010 Standard Edition Front End Server – A-L14FE2

Domain Controller  / Global Catalog /  Root Enterprise CA – A-DC1

Domain Controller  / Global Catalog – A-DC2

Client  – A-Client1 (Lync User Account Homed on A-L14FE1)

Client  – A-Client2 (Lync User Account Homed on A-L14FE2)

Configuration of  Domain Controllers

Operating System: Windows Server 2008 R2 SP1

Processor: 1

Memory: 1024MB static

Virtual Network Type – External NIC

Virtual Disk Type – System Volume (C:\): 60GB Dynamic

Note: In a real-world environment, depending on the needs of the business and environment, it is best practice to install your database and logs on separate disks/spindles. I installed Active Directory and Certificate Services on the same disks/spindles for simplicity’s sake in this lab.

Configuration of Lync 2010 Standard Edition Front End Servers

Operating System: Windows Server 2008 R2 SP1

Processor: 2

Memory: 2048 MB static

Virtual Network Type – External NIC

Virtual Disk Type – System Volume (C:\): 60 GB Dynamic

Configuration of Client Machines

Operating System: Windows 7 Enterprise X64

Processor: 1

Memory: 1024 MB dynamic (512 startup)

Virtual Network Type – External NIC

Virtual Disk Type – System Volume (C:\): 60 GB Dynamic

IP Addressing Scheme (Corporate Subnet)

IP Address – 192.168.1.x

Subnet Mask – 255.255.255.0

Default Gateway – 192.168.1.1

DNS Server – 192.168.1.51 Primary (A-DC1) / 192.168.1.55 (Secondary)

Base Topology and Configuration

Lync 2010 Topology

What I did was create two Central Sites.  The A-L14FE1.shudlab.net Standard Edition Pool Server will go in Chicago.  The A-L14FE2.shudlab.net Standard Edition Pool Server will go in Detroit.  There wasn’t a particular reason I decided to go with two Central Sites.  I could have put both Standard Pools in the same site.  The Lync 2010 Topology does let you have two Pools in the same Central Site and set the other Pool as a Backup Registrar just like if both Pools were in separate Central Sites.   Now I’m not going to go into detail on why we would want to use multiple Central Sites, but here’s a breakdown of some of the reasons I can think of off the top of my head:

  • Want to have separate Edge Pools for Media Traffic.  That way, we can put one set of users in Chicago and have it use Chicago pipe for Chicago users and we can put the other set of users in the other pool (for example, Detroit which would most likely be an entirely different region much farther away) so those users use an entirely different pipe for Edge media traffic.
  • We want to associate a specific Survivable Branch Appliance (SBA) or Survivable Branch Server (SBS) with a specific Pool.  This is done by creating a Branch Site within the Central Site.  The Branch Site would contain the SBA or SBS and because the SBA or SBS is within the Branch Site which is in a Central Site, that SBA is associated with the Pools in that Central Site.
  • We want more granular control over Call Admission Control.  I’m not going to get into detail on CAC, but CAC uses something called Regions, each of which is associated with a Central Site.  If we have our Pools in different Central Sites, it allows us to create more Regions, which gives us more control over how we link Regions together and therefore, how we can route Audio/Video traffic across our sites.
  • A thing to note is that when you have a PSTN Gateway associated to a Central Site, that PSTN Gateway cannot be associated to a Mediation Server in another Central Site.

With that out of the way, let’s take a look at our topology from a Site level.  As stated, we have two Sites: Chicago and Detroit.  We only have one SIP Domain and that SIP Domain is shudlab.net.

As you can see, we have two Standard Edition Front End Pools: A-L14FE1.shudlab.net and A-L14FE2.shudlab.net.

If we take a look at the Properties of our A-L14FE1.shudlab.net Pool, we can see A-L14FE2.shudlab.net is set as our Backup Registrar. I set the Failure detection interval (sec) to 30 seconds and the Failback interval (sec) to 40 seconds. Around 30 seconds after a Primary Registrar becomes unavailable, a Lync 2010 client that is connected or connecting to it will fail over to this Backup Registrar. Once the Primary Registrar is back online and the client detects that it is operational, shortly after 40 seconds the Lync 2010 client will automatically sign out and sign back into its Primary Registrar.

The defaults for these values are 300 seconds for Failure detection interval and 600 seconds for Failback interval.  But because this is a lab, and I will be taking and embedding video in this article series for you to witness the failover and failback, I didn’t want you guys to wait for long hence the lower values I have set.
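As a rough worked example of how these two intervals shape what a user sees, here is a minimal Python sketch. The function name `failover_timeline` is hypothetical, the times are approximate, and it assumes the outage lasts longer than the failure detection interval.

```python
def failover_timeline(primary_down_at: int, primary_up_at: int,
                      failure_detection: int = 300, failback: int = 600):
    """Return approximate (failover_time, failback_time) in seconds.

    Defaults are the topology defaults mentioned above; the lab
    overrides them with 30 and 40 seconds.
    """
    failover_time = primary_down_at + failure_detection
    failback_time = primary_up_at + failback
    return failover_time, failback_time


# Lab values: primary goes down at t=0 and comes back at t=120.
print(failover_timeline(0, 120, failure_detection=30, failback=40))  # (30, 160)
# Same outage with the default intervals.
print(failover_timeline(0, 120))
```

With the lab values the client moves to the Backup Registrar at roughly 30 seconds and returns to its Primary Registrar at roughly 160 seconds, which is why the lower values make for shorter videos.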

If we take a look at the A-L14FE2.shudlab.net Pool’s Resiliency settings, we will see a similar setup where A-L14FE1.shudlab.net is the Backup Registrar.

Note: Keep in mind that an SBA and/or SBS cannot be the Backup Registrar for a Pool.  You can, however, have a Pool be the backup Registrar for a SBA and/or SBS.  This is for registration only.  You can still have redundant voice routes that use the voice gateway and/or SIP Trunks in any location you would like.

Domain Name System (DNS)

If we take a look at the single SRV record currently in our environment, we see it is pointing to the Chicago Pool, A-L14FE1.shudlab.net. We can point it directly to the pool name instead of something like sip.shudlab.net because AD and our single SIP namespace both use shudlab.net. If your SIP namespace and your AD namespace (which the Pool uses) are different, you would have a SAN on the Pool certificate containing sip.<sip domain> for each SIP domain in your environment, and the SRV record for each namespace would point to its corresponding sip.<sip domain>. While we could still use sip.shudlab.net, I would rather use the Pool FQDN so that when a user connects, we can see in the logs that it is connected to the Pool FQDN rather than sip.shudlab.net. You’ll see what I’m talking about when we start looking at logs. I’ll make a mention of it.
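For readers who think in zone-file terms, the record described here could be written out with a small Python helper like the one below. This is only an illustrative sketch: `srv_line` is a hypothetical name, the priority and weight of 0 are assumptions, and port 5061 is taken from the registrar addresses that show up in the SIP traces later in the series.

```python
def srv_line(sip_domain: str, target: str, priority: int = 0,
             weight: int = 0, port: int = 5061) -> str:
    """Format an internal Lync sign-in SRV record in zone-file style."""
    return (f"_sipinternaltls._tcp.{sip_domain}. IN SRV "
            f"{priority} {weight} {port} {target}.")


# The single record in this lab points straight at the Chicago Pool FQDN.
print(srv_line("shudlab.net", "a-l14fe1.shudlab.net"))
```

If the SIP namespace differed from the AD namespace, the same helper would be called once per SIP domain with `target` set to the matching sip.<sip domain> name.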

Client Connectivity

We have 2 Client Computers both running Windows 7 x64 with Lync 2010 CU5 client installed:

  • A-Client1.shudlab.net
  • A-Client2.shudlab.net

We have 2  Users:

  • ClientUser1 (ClientUser1@shudlab.net)
  • ClientUser2 (ClientUser2@shudlab.net)

We have 2 Pools:

  • A-L14FE1.shudlab.net
  • A-L14FE2.shudlab.net

Everything with 1 is aligned together and everything with 2 is aligned together.  Therefore:

  • ClientUser1 is using A-Client1 computer which is associated to the A-L14FE1 Pool
  • ClientUser2 is using A-Client2 computer which is associated to the A-L14FE2 Pool

Conclusion

Thanks for reading Part 1.  In this Part, I went over what the lab setup looks like and took a look at the base topology and configuration before we really start diving into the actual scenario testing.

In Part 2, I will go through the sign-in process for each user.  Because the SRV record for sign-in is pointed to A-L14FE1.shudlab.net, the sign-in process will be different for each user logging in.  We’ll take a look at this in detail.

To read Part 2 of this article series, click here.

Lync 2010 Edge Servers and IP Requirements – NAT vs Public IP

To this date, I still see a lot of confusion as to the IP requirements of Lync 2010 Edge Servers.  I have seen the following questions:

  • Why do we require Public IP addresses on all 3 Lync Edge Roles when using Hardware Load Balancing?
  • Why does the Audio/Video Edge Role need a Public IP when behind a Hardware Load Balancer?
  • Why can we NAT when we’re doing DNS Load Balancing?
  • Why can’t we have a mix of Public IP and NAT’d IPs on the same Edge Server?

Have you been pondering the answer to any of the above four questions? If so, this blog article is for you.  Read on…

Edge IP Requirements and Assumptions

First of all, let’s list out Edge Networking requirements and assumptions according to the official Lync 2010 documentation (only listing the ones relevant to this document to preface my own points):

    • Two network interface cards (NICs) configured as follows:
      • Edge internal interface is on a different network than the Edge external interface.
      • The Edge external interface NIC has three IP addresses bound to it (that is, Access, Web Conferencing and A/V, with the Access IP address set to primary and the other two IP addresses set to secondary).
      • Only one default gateway is configured and it is assigned to the Access Edge external interface and pointed to the external firewall’s IP address.
    • The NIC for the Edge external interface and the NIC for the Edge internal interface are on two separate networks that do not have routing configured between them.
    • Windows Server 2008 strong host model is used on all Edge Servers. For details, see “The Cable Guy: Strong and Weak Host Models” at http://go.microsoft.com/fwlink/?LinkId=178004
    • There is a route from the network containing the Edge internal interface to any networks that contain Lync 2010 clients or servers running Lync Server 2010.
    • All Edge external interfaces use either NAT, with destination IP addresses changed inbound and the source IP addresses changed outbound combined with DNS load balancing. Or, they use publicly routable IP addresses combined with hardware load balancing.
      • A hybrid configuration with Access Edge service and Web Conferencing Edge service behind NAT and the A/V Edge service configured with a publicly routable IP address, is not supported in Lync Server 2010.

In Lync Server 2010, only 2 NICs are supported on the Edge Role

Some history information on Office Communications Server (OCS) 2007 R2

In Office Communications Server (OCS) 2007 and OCS 2007 R2, we generally saw two types of NIC Configurations:

Note: Keep in mind that while Private IPs are shown above, you can alternatively use Public IP Address on the NICs themselves.

Method #1

Every Role has its own dedicated NIC. This is recommended due to people having issues in the past with communications when roles share IP Addresses on the same NIC.

Method #2

It is also possible to use one NIC for the Audio/Video Edge Server, Web Conferencing Edge Server, as well as the Access Edge Server. Because of this, all 3 Edge Server Roles would have Private IPs meaning they can all be on the same NIC. You would then use a dedicated NIC for the Internal NIC.

Method #1 worked just fine out of the box with Windows 2003. Windows 2008 and Windows 2008 R2 both use the new Strong Host networking model, which introduces some complications when using Method #1. The Strong Host model behaves differently from the Weak Host model for security reasons. For example, if traffic comes in on one interface, it’s going to leave back out that same interface. But with Windows 2003 networking, you can only have one default gateway. So there are some tricks needed with multiple NICs, such as assigning multiple Default Gateways and tweaking your Windows routes. Jeff Schertz, Lync MVP, details this on his blog article here. Generally, Method #1 will give you greater performance benefits, but with how OCS scales and its sizing guidance, 2 NICs are fine.

Fast Forward to Lync Server 2010

See how complicated this all was in OCS 2007 R2?  Should I use 2 NICs?  Should I use 4 NICs?  If I use 4 NICs, should I create additional static routes?  Should I disable strong host model?  This complication is unnecessary.  This is why we have the following requirement listed above which I have again posted here:

    • Two network interface cards (NICs) configured as follows:
      • Edge internal interface is on a different network than the Edge external interface.
      • The Edge external interface NIC has three IP addresses bound to it (that is, Access, Web Conferencing and A/V, with the Access IP address set to primary and the other two IP addresses set to secondary).

We can see that based on these requirements, we must now use only 2 NICs.  In the requirements listed above, you may also have seen the following:

  • All Edge external interfaces use either NAT, with destination IP addresses changed inbound and the source IP addresses changed outbound combined with DNS load balancing. Or, they use publicly routable IP addresses combined with hardware load balancing.
    • A hybrid configuration with Access Edge service and Web Conferencing Edge service behind NAT and the A/V Edge service configured with a publicly routable IP address, is not supported in Lync Server 2010.

What this means is, if we’re using a single Edge Server or DNS Load Balancing and are using NAT, the external interface must contain all Private IP Addresses. There’s no hybrid configuration. If we’re using a single Edge Server or DNS Load Balancing and are using Public IP Addresses, the external interface must contain all Public IP Addresses. If we’re using a Hardware Load Balancer, documentation has always mentioned the A/V role requires a Public IP Address. This is still true with Lync Server 2010. But because there is no hybrid configuration (for the sake of simplicity), that means all Lync Server 2010 Edge Roles must now use Public IP Addresses as well.
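The two rules quoted from the documentation (no mix of NAT'd and public addresses, and public addresses only when hardware load balancing) can be expressed as a quick sanity check. The sketch below is a hedged illustration, not a Microsoft tool: `check_edge_external` is a hypothetical name, "private" is assumed to mean the RFC 1918 ranges, and the 203.0.113.x addresses are documentation placeholders standing in for real public IPs.

```python
import ipaddress
from typing import Dict, List

# Assumption: "NAT'd" addresses are taken to be RFC 1918 private addresses.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]


def is_rfc1918(address: str) -> bool:
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in RFC1918)


def check_edge_external(addresses: Dict[str, str],
                        hardware_load_balanced: bool) -> List[str]:
    """Return a list of problems with the Edge external IP assignment."""
    problems = []
    private_roles = [role for role, addr in addresses.items() if is_rfc1918(addr)]
    # Hybrid check: either every role is private (NAT) or every role is public.
    if private_roles and len(private_roles) != len(addresses):
        problems.append("hybrid of NAT'd and public IPs is not supported")
    # Hardware load balancing requires publicly routable IPs on every role.
    if hardware_load_balanced and private_roles:
        problems.append("hardware load balancing requires public IPs on all roles")
    return problems


nat_edge = {"access": "192.168.10.11", "webconf": "192.168.10.12", "av": "192.168.10.13"}
hybrid_edge = {"access": "192.168.10.11", "webconf": "192.168.10.12", "av": "203.0.113.13"}

print(check_edge_external(nat_edge, hardware_load_balanced=False))
print(check_edge_external(hybrid_edge, hardware_load_balanced=False))
print(check_edge_external(nat_edge, hardware_load_balanced=True))
```

An empty list means the assignment is consistent with the two rules; it says nothing about firewall or routing configuration.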

The NIC Model in Lync Server 2010 is as follows:

We can see that if we’re using a single Edge Server or DNS Load Balancing, we can use either NAT or Public IPs on the External NIC.  But if we’re using Hardware Load Balancers, we must use Public IP Addresses on the external NIC.  This brings us to our next topic.
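The rule above can be written down as a small lookup.  This is only a sketch in Python; the function name, the dictionary, and the "nat"/"public" labels are made up for illustration and are not part of any Lync tool.

```python
# Hypothetical helper: which external-interface addressing options this
# article describes as supported for each Edge load-balancing choice.
SUPPORTED_ADDRESSING = {
    "single_edge": {"nat", "public"},
    "dns_load_balancing": {"nat", "public"},
    "hardware_load_balancing": {"public"},
}


def is_supported(load_balancing: str, addressing_by_role: dict) -> bool:
    """Return True if every external Edge role uses the same, supported addressing.

    addressing_by_role maps a role name to "nat" or "public", for example
    {"access": "nat", "webconf": "nat", "av": "nat"}.
    """
    kinds = set(addressing_by_role.values())
    # A hybrid (some roles NAT'd, some public) is not supported in Lync Server 2010.
    if len(kinds) != 1:
        return False
    return kinds.pop() in SUPPORTED_ADDRESSING[load_balancing]


all_nat = {"access": "nat", "webconf": "nat", "av": "nat"}
print(is_supported("dns_load_balancing", all_nat))
print(is_supported("hardware_load_balancing", all_nat))
```

With all three roles behind NAT, the DNS Load Balancing case is accepted and the Hardware Load Balancer case is rejected, which matches the NIC model described above.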

Why do we need Public IP Addresses on our Lync 2010 Edge Server’s External Interface when using Hardware Load Balancers?

There is a lot of information to digest to fully understand why we need to use a Public IP on a Lync Server 2010 Edge Server’s External Interface when using Hardware Load Balancers.  I will do my best to explain thoroughly.

Let’s start with the following blog article which provides a good summary on Public IP requirements for the external interface of the Edge Server for the Audio/Video Edge Role: http://blogs.technet.com/b/chlacy/archive/2008/03/12/a-v-edge-and-publicly-routable-ip-addresses-part-ii.aspx

It states,

The external A/V Edge requires a publicly routable IP address for several reasons.  First, the A/V Edge server implements the STUN protocol, a mechanism whereby the A/V Edge server reflects back the IP address it saw from a user’s home router.  This home router IP address is used to enable the use of efficient media paths using the ICE protocol and is also needed to ensure proper IP permissions are set on the A/V Edge server’s 50,000 port range.  If the A/V Edge external address was behind a NATed IP, the A/V edge server would return that address instead of the address of the home router, leading to less efficient (sometimes broken) media paths and permission issues on the 50,000 port range.  A second reason for publicly routable IPs is to support UDP load balancing.  For real time audio/video traffic, UDP is the preferred protocol to transfer RTP packets.  However, UDP is a stateless protocol, so some load balancers distribute UDP packets to the servers without any context for the current session.  To mitigate this, the A/V edge server returns its external IP address on the first UDP packet of a media session, and OC or the Meeting Console client sends subsequent UDP traffic directly to that IP address instead of through the load balancer.  In order for this mechanism to work, the external IP must be publicly routable.  Note that supporting a publicly routable IP address on the external edge does not preclude a company from using a firewall.  To the contrary, Microsoft recommends that all externally facing servers be protected with a firewall…provided that firewall does not NAT the IP address.

Source Network Address Translation (SNAT)

In order to fully understand the above and what it means for Audio/Video needing a Public IP (and therefore, based on what was discussed above, all other Edge Roles also needing Public IPs for the external interface), we must first understand how Load Balancers work with Source Network Address Translation (SNAT).

To put it simply, SNAT changes the source address in the IP header of the packet to be the IP of the Hardware Load Balancer.  What this means is that when a Hardware Load Balancer sends traffic to a server, the Source IP will be that of the Hardware Load Balancer.  The server will therefore return the traffic to the Hardware Load Balancer, and the Hardware Load Balancer will then communicate back to the client who initiated the original communication.  The server will never directly communicate back to the originating client.

Let’s demonstrate this in a one-armed SNAT configuration for a typical Edge Server.  My explanation of SNAT is based on Andrew Ehrensing’s TechEd video on Exchange 2010 Load Balancing, which you can find here and which explains SNAT using all private IP Addresses.  Later on, I’ll explain how this relates to Lync Server 2010 Edge Server’s Access Edge, Web Conferencing Edge, and Audio/Video Edge IPs.

As we can see in this screenshot, we have a Hardware Load Balancer with a NIC assigned as 10.10.10.5.  Because this NIC belongs to the 10.10.10.0/24 network, we can create a Virtual IP Address of 10.10.10.6 which is what clients will connect to.

Our first packet will be from the Client Computer with an IP of 10.10.10.40 to the Hardware Load Balancer’s Virtual IP Address of 10.10.10.6.

We now see SNAT at work.  When the traffic is sent to the server, we see the Source IP is defined as the Hardware Load Balancer’s Self IP.  This means that the Server will see the Source IP as 10.10.10.5 instead of 10.10.10.40 and because of that, the server will return the traffic to 10.10.10.5 instead of 10.10.10.40.

As we can see, the Server will respond to 10.10.10.5 with the requested data instead of responding to 10.10.10.40.

We finally see the VIP respond to the Client computer with the data the server had returned to the hardware load balancer.
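The four steps above can be replayed as a toy model.  This is a minimal sketch in Python, not how any real load balancer is implemented.  The client, VIP, and Self IP addresses come from the walkthrough; the server address 10.10.10.20 and all class and function names are assumptions made for illustration.

```python
from dataclasses import dataclass

CLIENT_IP = "10.10.10.40"   # client from the walkthrough
VIP = "10.10.10.6"          # load balancer Virtual IP
SELF_IP = "10.10.10.5"      # load balancer Self IP
SERVER_IP = "10.10.10.20"   # assumed; the walkthrough does not give the server's IP


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class SnatLoadBalancer:
    """Toy one-armed load balancer that applies SNAT to a single flow."""

    def __init__(self):
        self.original_client = None  # remembered so the reply can be returned

    def forward_to_server(self, packet: Packet) -> Packet:
        # Inbound: client -> VIP becomes Self IP -> server.
        self.original_client = packet.src
        return Packet(src=SELF_IP, dst=SERVER_IP)

    def return_to_client(self, packet: Packet) -> Packet:
        # Outbound: server -> Self IP becomes VIP -> client.
        return Packet(src=VIP, dst=self.original_client)


def server_reply(request: Packet) -> Packet:
    # The server simply answers whatever source address it saw.
    return Packet(src=request.dst, dst=request.src)


def trace_flow():
    lb = SnatLoadBalancer()
    step1 = Packet(src=CLIENT_IP, dst=VIP)
    step2 = lb.forward_to_server(step1)
    step3 = server_reply(step2)
    step4 = lb.return_to_client(step3)
    return [step1, step2, step3, step4]


for number, packet in enumerate(trace_flow(), start=1):
    print(f"step {number}: {packet.src} -> {packet.dst}")
```

Notice that the server never sees 10.10.10.40 in this model.  That is the property that matters for the Audio/Video Edge discussion that follows.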

How does SNAT relate to the Lync Server 2010 Edge?

Let’s take a look at some F5 documentation on the external interface of a Lync Server 2010 Edge Server.  You can see the F5 BIGIP LTM 10.2.2 documentation here.  As of the writing of this blog article, the document revision is 1.9.  Looking at Page 7 (may be different in a future document revision), we can see the section entitled, “Configuration table for BIG-IP objects: Edge Servers – External Interface.”  Take a look at the SNAT requirements for each Lync Server 2010 Edge Role.  You will see that SNAT is enabled for the Access Edge Role and the Web Conferencing Edge Role.  SNAT is NOT enabled for the Lync Server 2010 Audio/Video Edge Role.

What does this mean?  This means that the Audio/Video Edge Server will respond directly to the client on the Internet and will not return Audio/Video or even Desktop Sharing traffic through the Hardware Load Balancer.  In fact, if we take a look at the Planning Documentation for Lync Server 2010, we can see that ports required for Audio/Video need to be opened to and from  the Lync Server 2010 Audio/Video Edge role directly as well as to the Hardware Load Balancer Virtual IP Address (VIP) belonging to the Audio/Video role.  We can see this in the following diagram provided by Microsoft for the External Topology with Hardware Load Balancing.

Putting it all together

Now with that said, let’s put all this information together to really understand why we must use a Public IP address on the Edge Server’s Audio/Video IP.  The Lync Audio/Video Edge, which is based on ICE (TURN/STUN), must be able to respond directly to clients with its own Public IP Address.  If we were to use SNAT, the Audio/Video Edge would respond through the Hardware Load Balancer, and Audio/Video traffic could overload the Hardware Load Balancer and potentially provide a degraded experience.  On top of that, if the UDP packets were to go through a Hardware Load Balancer, you’d run into the UDP issues described earlier: UDP is a stateless protocol, so some load balancers distribute UDP packets to the servers without any context for the current session.  Having a Public IP on the Edge and not using SNAT on the Hardware Load Balancer for the Audio/Video IP allows the Edge Server to respond directly to clients.  This negates the UDP problems mentioned and allows a better Audio/Video experience, with the client being able to talk directly to the Lync Server 2010 Edge Server’s Audio/Video Public IP.

Now, I’m sure you are asking the question: if I’m using DNS Load Balancing, why can I use a NAT’d IP for the Lync Server 2010 Edge Server?  Well, it’s pretty simple.  We have no Hardware Load Balancer in the mix.  When we configure DNS Load Balancing, it allows us to specify the Publicly Routable IP Address of our Lync Server 2010 Edge Server.  Because of this, the Lync Server 2010 Edge Server will respond with its Public IP Address instead of the NAT’d Private IP Address.  We can’t do this with a Hardware Load Balancer because we can’t NAT the connections twice; once to a Private IP on the HLB Audio/Video VIP and another to the Private IP of the Audio/Video IP.  So we elect to have Public IP Addresses on both the VIP and the Server, the traffic is not NAT’d, the Audio/Video Role can reply with its Public IP, and clients will then begin to send communication directly to the Lync Server 2010 Edge Server.


Using Lync 2010 Mobility on your Corporate WIFI Networks

Lync 2010 Mobility has been out for a few months now.  Jeff Schertz has a great writeup on Lync Mobility on his blog here.  What I wanted to go into is some more detail on deploying Lync Mobility on your corporate wifi networks which I haven’t seen documented in very good detail on Technet or other blog articles.  Now keep in mind, this blog article is for deploying Lync Mobility on your corporate wifi networks, not your guest wifi networks.  Basically, any wifi network that can access your Front End Servers.

There are two considerations to take when deploying Lync 2010 Mobility on your Corporate WIFI Network that you need to be cognizant of after reading Jeff Schertz’s Mobility article and the official Mobility documentation:

  • Certificates
  • Talking to the External Mobility Services

Certificate Issues for Lync 2010 Mobile Clients Connecting over Corporate WIFI

If we take a look at Jeff’s article or the official Lync Mobility document, we can see that there is an FQDN we add to internal DNS:

  • lyncdiscoverinternal.domain.com

The basic process for how the Lync 2010 Mobile Client will connect to Mobility Services while on a corporate WIFI network is as follows:

As we can see by the above, the Lync 2010 Mobile client does a lookup for lyncdiscoverinternal.domain.com.  It is because of this that the Lync Mobile documentation has us replace the certificate on our Front End Servers.  That means that any request to lyncdiscoverinternal.domain.com will eventually terminate (SSL termination) against our Front End Server.  In the majority of deployments, the Front End Servers and Internal Edge NIC will have certificates signed by your internal certificate authority.

This means that Mobile clients will have some issues with connectivity to Lync 2010 Mobile Services, as the certificate for lyncdiscoverinternal.domain.com would be signed by your internal certificate authority.  Domain-joined machines will most likely already have a copy of your Root Certificate Authority’s self-signed certificate.  If your Root Certificate Authority is an Enterprise Root CA, it automatically publishes its certificate to Active Directory.  When domain-joined machines sign into AD, they will install these certificates.  For Standalone Root CAs, you have probably used Group Policy to publish your Root/Intermediate certificates or used certutil -dspublish.  The issue here is that these Mobile Devices do not have a copy of your internal Certificate Authority’s certificate.  Thus, they will have certificate/connectivity problems when on the WIFI Network.

My experience at a previous client with the different mobile devices has shown the following results:

  • Windows Phone 7: Seemed to function even without the root certificate.  WP7 seems to employ some kind of silent fallback mechanism to connect to the external network and attempt to find the external web services name.
  • iOS 5: Received an error that we could not connect to the server, without any certificate warning.  It just would not connect to the server.  After importing the root certificate on the iOS device, we could connect without any issue.
  • Android: Received a certificate warning.  We were presented with a connect button on the bottom left which allowed the user to connect regardless of the warning/error that they received about not trusting the server they were connecting to.

Now, these certificate warnings may be unacceptable to your organization.  If they are, you will want to replace your Front End/IIS Certificate(s) with a certificate from a Public Certificate Authority.  Keep in mind you will want to replace the internal Edge Server’s certificate with a Public Certificate as well.  I have seen issues where if the Front End and Internal Edge had certificates from different CAs, they would stop replicating with each other.  This bug may have been fixed as this happened several months ago when Lync 2010 was still relatively new.

However, other than having Public Certificates in the entire infrastructure, there is another method.

How to prevent certificate errors and still utilize internal certificates on your internal Lync 2010 infrastructure

There is a method you can use to get LyncDiscoverInternal.domain.com to function without needing to configure your Lync 2010 Front End Servers and Lync 2010 Edge Server’s Internal NIC with a public certificate.  You can prevent certificate errors by having all LyncDiscoverInternal.domain.com requests go to your Reverse Proxy, which will use a Public Certificate.  By taking a look at the Lync Mobility documentation, we can see that both 80 and 443 can be used to service Lync Autodiscover Mobile requests.  Because of this, we can have TMG also service LyncDiscoverInternal.domain.com requests.  A couple of options here would be to:

  • On the Web Services rule for Lync 2010 which handles Simple URLs and the External Web Services FQDN, we can add all LyncDiscover.domain.com FQDNs (one for each SIP Domain) as well as all LyncDiscoverInternal.domain.com (one for each SIP Domain).
  • Create a new Web Listener and Web Services rule for Lync 2010 Mobile Autodiscover requests that handles Lync 2010 Autodiscover only.  This Web Listener will listen on port 80 and will bridge to 8080 on the Front End Server or Hardware Load Balancer that services the Lync 2010 Pool.  The Mobile Client, as stated earlier, will attempt both HTTP and HTTPS for Autodiscover.  Because the Autodiscover FQDNs will point to the Reverse Proxy (ISA/TMG), HTTP will work for Autodiscover and the client will successfully connect.

In taking a look at the following diagram that is provided in the Lync 2010 Planning Documentation, the DNS record on the far right, lyncdiscoverinternal.contoso.net, would point to the NIC on the Reverse Proxy Server.  This would require you to ensure that internal communications over either 80 or 443 (depending on which scenario above is used) are allowed so that autodiscover requests from the Lync 2010 Mobile client on WIFI networks function properly.

To verify that LyncDiscoverInternal.domain.com functions properly while on the Internal WIFI Network, connect to the WIFI Network and use the following Autodiscover URL to test Autodiscover Connectivity:

https://lyncdiscoverinternal.domain.com/autodiscover/autodiscoverservice.svc/root/domain

The following Autodiscover results are provided back to Internet Explorer.  As you can see, it provides Redirect Information on where the client should now connect to make a successful Autodiscover Request:

{"AccessLocation":"Internal","Root":{"Links":[{"href":"https:\/\/InternalWeb.domain.com\/Autodiscover\/AutodiscoverService.svc\/root\/domain","token":"Domain"},{"href":"https:\/\/InternalWeb.domain.com\/Autodiscover\/AutodiscoverService.svc\/root\/user","token":"User"},{"href":"https:\/\/InternalWeb\/Autodiscover\/AutodiscoverService.svc\/root\/oauth\/user","token":"OAuth"}]}}

We will use the following new URL to see the entire Autodiscover result:

https://InternalWeb.domain.com/Autodiscover/AutodiscoverService.svc/root/domain

This provides us with the new following results.  As you can see, the MCX URL we use is the External Web Services FQDN.  This means that even if we have a client on the internal corporate WIFI, they must connect to the external web services FQDN that is published through TMG.

{"AccessLocation":"Internal","Domain":{"Links":[{"href":"https:\/\/InternalWeb.domain.com\/Autodiscover\/AutodiscoverService.svc\/root","token":"Internal\/Autodiscover"},{"href":"https:\/\/InternalWeb.domain.com\/Reach\/sip.svc","token":"Internal\/AuthBroker"},{"href":"InternalWeb.domain.com\/Ucwa\/Discovery","token":"Internal\/Ucwa"},{"href":"https:\/\/ExternalWeb.domain.com\/Mcx\/McxService.svc","token":"Internal\/Mcx"},{"href":"https:\/\/ExternalWeb.domain.com\/Autodiscover\/AutodiscoverService.svc\/root","token":"External\/Autodiscover"},{"href":"https:\/\/ExternalWeb.domain.com\/Reach\/sip.svc","token":"External\/AuthBroker"},{"href":"https:\/\/ExternalWeb.domain.com\/Ucwa\/Discovery","token":"External\/Ucwa"},{"href":"https:\/\/ExternalWeb.domain.com\/Mcx\/McxService.svc","token":"External\/Mcx"}],"SipClientExternalAccess":{"fqdn":"sip15.ms.cdw.com","port":"443"},"SipClientInternalAccess":{"fqdn":"LyncPool.domain.com","port":"5061"},"SipServerExternalAccess":null,"SipServerInternalAccess":{"fqdn":"LyncPool.domain.com","port":"5061"}}}
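If you would rather not read the raw response by eye, the links can be pulled out with a few lines of Python.  This is just a sketch using the standard json module.  The sample text is a trimmed copy of the response above (typed with plain straight quotes), and the function name is my own.

```python
import json

# Trimmed copy of the Autodiscover response shown above, with straight quotes.
SAMPLE_RESPONSE = r'''
{"AccessLocation":"Internal","Domain":{"Links":[
 {"href":"https:\/\/InternalWeb.domain.com\/Autodiscover\/AutodiscoverService.svc\/root","token":"Internal\/Autodiscover"},
 {"href":"https:\/\/ExternalWeb.domain.com\/Mcx\/McxService.svc","token":"Internal\/Mcx"},
 {"href":"https:\/\/ExternalWeb.domain.com\/Mcx\/McxService.svc","token":"External\/Mcx"}
]}}
'''


def links_by_token(response_text: str) -> dict:
    """Map each link token (for example "Internal/Mcx") to its URL."""
    document = json.loads(response_text)
    return {link["token"]: link["href"] for link in document["Domain"]["Links"]}


links = links_by_token(SAMPLE_RESPONSE)
print("Internal/Mcx ->", links["Internal/Mcx"])
print("External/Mcx ->", links["External/Mcx"])
```

Both Mcx tokens resolve to the same ExternalWeb.domain.com URL, which is the point being made here: the internal Mcx entry still sends the mobile client to the External Web Services FQDN.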

Another way to look at the Autodiscover response is by taking a look at the Lync client’s Mobility Diagnostic Log.  For information on how to view these diagnostic logs, please see Randy Wintle’s article here.  The following XML data will be seen which is formatted a bit differently than viewed above:

<?xml version="1.0" encoding="utf-8"?><AutodiscoverResponse AccessLocation="Internal" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><Domain><SipServerInternalAccess fqdn="LyncPool.domain.com" port="5061"/><SipClientInternalAccess fqdn="LyncPool.domain.com" port="5061"/><SipClientExternalAccess fqdn="sip.domain.com" port="443"/><Link token="Internal/Autodiscover" href="https://InternalWeb.domain.com/Autodiscover/AutodiscoverService.svc/root"/><Link token="Internal/AuthBroker" href="https://InternalWeb.domain.com/Reach/sip.svc"/><Link token="Internal/Ucwa" href="https://InternalWeb.domain.com/Ucwa/Discovery"/><Link token="Internal/Mcx" href="https://ExternalWeb.domain.com/Mcx/McxService.svc"/><Link token="External/Autodiscover" href="https://ExternalWeb.domain.com/Autodiscover/AutodiscoverService.svc/root"/><Link token="External/AuthBroker" href="https://ExternalWeb.domain.com/Reach/sip.svc"/><Link token="External/Ucwa" href="https://ExternalWeb.domain.com/Ucwa/Discovery"/><Link token="External/Mcx" href="https://ExternalWeb.domain.com/Mcx/McxService.svc"/></Domain></AutodiscoverResponse>

All Lync Mobile clients must connect through the external web services FQDN

So by now, we realize that a Mobile Client on the corporate network must connect through external web services.  This means we must do the following:

  • Create our external web services FQDN (ExternalWeb.domain.com) in our internal DNS infrastructure that mobile clients resolve against.  This FQDN will point to the Public IP of External Web Services.  Essentially, the DNS record created in External DNS and the DNS record created in Internal DNS will be identical.
  • Allow our mobile client connected to WIFI to connect to external web services.  This will be done by hairpinning.  Essentially, this means if the mobile client when connected to WIFI must connect out to the 131.x.x.50 (in this example, 131.x.x.50 will be our external web services IP pointed to the external interface of TMG) and then back into the NAT’d IP of TMG without completely going out to the internet.  Thus the traffic is hairpinned.
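To confirm which address a WIFI-connected device would actually receive for the external web services FQDN, a quick check along these lines can help.  This is a rough Python sketch, to be run from a machine that uses the same internal DNS servers as the WIFI clients.  The function names are my own and ExternalWeb.domain.com is only the placeholder name used in this article.

```python
import ipaddress
import socket

# RFC 1918 private ranges, listed explicitly.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address: str) -> bool:
    """Return True if the IPv4 address falls inside an RFC 1918 private range."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


def resolved_addresses(fqdn: str) -> list:
    """Resolve an FQDN to its IPv4 addresses using the machine's own DNS settings."""
    infos = socket.getaddrinfo(fqdn, 443, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def describe(fqdn: str) -> None:
    """Print each resolved address and whether it is a private address."""
    for address in resolved_addresses(fqdn):
        kind = "private (RFC 1918)" if is_rfc1918(address) else "not private"
        print(f"{fqdn} -> {address} [{kind}]")


# Example call (placeholder name, replace with your own FQDN):
# describe("ExternalWeb.domain.com")
```

With the hairpinning design described above, the internal lookup should come back with the same public address that external DNS returns, so the result should be reported as not private.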

Now I’m sure the following question is going through your head: Why must we have all mobility services connect to the External Web Services FQDN and why aren’t we using the Pool and Edge Server just like the Lync 2010 client installed on Desktop Operating Systems?  There are a couple answers to this question:

  • SIP protocol by nature has long hold times.  HTTP protocol by nature has short hold times.  Mobile clients these days have the ability to switch between WIFI and cellular networks in a very fast if not seamless manner.  By having Lync 2010 Mobile clients use HTTP, which has short hold times, Lync 2010 Mobile Clients can maintain connectivity during this WIFI to cellular (and vice versa) transition.
  • Now that we understand why we are using HTTP based on the above bullet, the reason we always want to connect to external web services is that we want to maintain connectivity to the same location to ensure a faster/smoother transition between WIFI and cellular (and vice versa) networks.  We must also maintain the same persistence while maintaining these connections.  By having the clients connect to the same place and maintaining affinity (if using HA and a certificate for cookie based affinity on the HLB), we can maintain affinity from your Mobile Client to the Reverse Proxy to the Hardware Load Balancer and then to the Front End Pool Servers.

An Alternative Way to connect to external web services without the use of hairpinning (Less Preferable than Hairpinning)

Let me start this by saying that this method is not recommended due to extra traffic and burden on your bandwidth and DNS Servers.  If, for whatever reason, you cannot hairpin the traffic so the internal WIFI network can communicate to the external web services public IP address, the alternative would be to point the external web services FQDN that is located in Internal DNS to the Internal IP address of your Reverse Proxy Server.  With this mechanism, when the Mobile Client connected to WIFI looks up the external Web Services FQDN in Internal DNS, it will get a private IP response and connect to the Reverse Proxy in that fashion.  When an internet connected mobile device gets the Autodiscover Response and does a DNS lookup, it will receive the Public IP address of External Web Services.

Now if you have read the 2 bullets at the bottom of the section entitled, “All Lync Mobile clients must connect through the external web services FQDN” you will understand that this method goes against the Mobility model.  One of the ways to alleviate issues when switching between WIFI and cellular networks (and vice versa) would be to change the External Web Services FQDN (in both internal and external DNS) to have a lower TTL value or even a TTL value of 0.  This way, when a mobile client switches from WIFI to cellular (or vice versa), it will do a new DNS lookup since the TTL value is 0, find the new IP address, and successfully connect.  This will obviously not be a seamless transition but it does provide a method of being able to reestablish a connection.  But, this also means that Mobile Clients and all other Lync 2010 Clients (Phone Edition, desktop client, etc.) will constantly have to do DNS lookups, which will cause more network utilization as well as DNS Server utilization.  So if it is decided this is the route that will be taken, be sure to be aware of the negative ramifications that this entails.

 


Enabling QoS for Lync Server 2010 – Part 2

Welcome to Part 2 on how to Enable QoS for Lync Server 2010. The purpose of this multi-part article (first part for QoS on Lync Client and second part for QoS on Lync Server) is to lay everything out in a concise manner to help you, the reader, understand how to enable QoS.  Keep in mind that this article only covers enabling QoS; it is not a comprehensive guide on all the various dynamic ports available in Lync to lock down your firewalls.  For that, you can check out my other article here.  The question may also arise: why and when would you want to enable QoS?  Audio and Video are synchronous traffic that can be affected by jitter, delay, and packet loss on an IP Network.  Lync has been designed to work without QoS, but Lync Administrators can choose to enable both Lync endpoints and servers to mark Differentiated Services Code Point (DSCP) values on audio and video packets.  This allows audio/video packets to be prioritized on a network that is enabled for Differentiated Services (DiffServ).

To better understand DiffServ and its effect on the network, please check out the excellent blog article written by fellow Lync MVP Jeff Schertz at the following URL: http://blog.schertz.name/2011/08/lync-qos-behavior/

Part 1

Part 2

Server QOS

General Procedure for Server QoS

In Part 1, we talked about Windows Vista/7 vs Windows XP.  Windows 7 and Windows Vista utilize Policy based QoS and Windows XP used QoS based on the Packet Scheduler.  For Lync Servers, you’ll always use Policy based QoS since Lync Server 2010 can only be installed on Windows 2008 or Windows 2008 R2 which both utilize Policy based QoS.  For Server based QoS, we can configure Conferencing Servers, Application Servers, and Edge Servers (which will use QoS based on the destination port rather than the source port as everything else does).

Client to Server Port Configuration for Conferencing Servers and Application Servers

Out of the box, the Client to Server port ranges are different for each modality except for Application Sharing, whose default range overlaps both Audio and Video. The default ports for a Conferencing Server are as such:

  • Audio: 49152 to 57500
  • Video: 57501 to 65535
  • Application Sharing: 49152 to 65535

A minimum of 40 ports is required for Application Sharing.  We will specify an 8,348 port range that is unique from other ports.  Ultimately, we will set Application Sharing to use the following ports:

  • Application Sharing: 40803 to 49151

To set this, we will run the following command:

Set-CsConferenceServer -Identity <ConferenceServer:FQDN of Lync Pool or A/V Server/Pool FQDN> -AppSharingPortStart 40803 -AppSharingPortCount 8348

Configuring an Application Server is identical.  The only difference is that you use the Set-CsApplicationServer cmdlet instead of Set-CsConferenceServer.  Make sure to include these ports in the QoS Policies for Edge Servers as you will learn later.
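To see why the Application Sharing range gets moved, it helps to check the ranges for overlap.  The following is a small Python sketch using the port ranges quoted above, treated as inclusive.  The function and variable names are my own.

```python
# Inclusive port ranges quoted in this article.
DEFAULT_RANGES = {
    "Audio": (49152, 57500),
    "Video": (57501, 65535),
    "Application Sharing": (49152, 65535),
}
# Same as the defaults, but with Application Sharing moved to its new range.
REVISED_RANGES = dict(DEFAULT_RANGES, **{"Application Sharing": (40803, 49151)})


def overlapping_pairs(ranges: dict) -> list:
    """Return pairs of modality names whose inclusive port ranges share a port."""
    names = sorted(ranges)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            (a_start, a_end), (b_start, b_end) = ranges[first], ranges[second]
            if a_start <= b_end and b_start <= a_end:
                pairs.append((first, second))
    return pairs


print("default:", overlapping_pairs(DEFAULT_RANGES))
print("revised:", overlapping_pairs(REVISED_RANGES))
```

With the defaults, Application Sharing collides with both Audio and Video, so a port-based QoS policy cannot tell the modalities apart.  With the revised range, nothing overlaps.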

Client to Server Port Configuration for Dedicated Mediation Servers

A Mediation Server of course only handles Audio since its job is to transcode RTAudio to G.711.  The default ports for a Mediation Server are as such:

  • Audio: 49152 to 57500

No Changes to this port range will be required.  If the Mediation Server is collocated on a Front End Server, no changes will need to be done as you can see the Audio Port Range for a dedicated Mediation Server is the same as the Audio Port Range for a Front End Conferencing Server.

Edge Server Policy Configuration

An Edge Server doesn’t get configured per se.  But the policy that you create is based on a destination port (rather than source port like client peer to peer or client to server).  The destination port configuration in the QoS Policy is configured based on the client peer to peer ports you defined in Part 1 of this article series as well as the client to server ports you defined in this Part 2 of this article series.

So if we take a look at everything we’ve done so far, we have the following peer to peer configuration from Part 1 of this article series:

  • Audio: 20000 to 20039
  • Video: 20040 to 20079

And we have the following client to server configuration from Part 2 of this article series:

  • Audio: 49152 to 57500
  • Video: 57501 to 65535
  • Application Sharing: 40803 to 49151

The Edge Server will need to have several QoS Policies configured to handle each modality (Application Sharing is not as critical as Audio/Video but can be enabled) for peer to peer (Audio/Video) and client to server (Audio/Video).  Additional QoS Policies may be needed depending on the Application Servers in the environment and whether they have any different port ranges from your Peer to Peer or Client to Server port configurations.
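To keep track of what needs to be created on the Edge, the policies can be listed out from the port ranges.  The following Python sketch is only a planning aid and does not configure anything.  The policy names are made up, and it uses the DSCP values this article uses later on (46 for audio, 34 for video).

```python
from typing import NamedTuple


class QosPolicy(NamedTuple):
    name: str
    dscp: int
    destination_ports: str  # "start:end", the form typed into the QoS policy wizard


# DSCP values used in this article.
DSCP = {"Audio": 46, "Video": 34}

# Inclusive port ranges from Part 1 (peer to peer) and Part 2 (client to server).
PORT_RANGES = {
    "Peer to Peer": {"Audio": (20000, 20039), "Video": (20040, 20079)},
    "Client to Server": {"Audio": (49152, 57500), "Video": (57501, 65535)},
}


def edge_policies() -> list:
    """Build one policy per traffic type and modality, keyed on destination port."""
    policies = []
    for traffic_type, ranges in PORT_RANGES.items():
        for modality, (start, end) in ranges.items():
            policies.append(
                QosPolicy(f"Edge {traffic_type} {modality}", DSCP[modality], f"{start}:{end}")
            )
    return policies


for policy in edge_policies():
    print(policy)
```

This gives four policies: two for peer to peer and two for client to server.  If you also want QoS for Application Sharing, the 40803 to 49151 range would be added as a fifth entry with whatever DSCP value you choose.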

Configuring Policy Based QOS in Group Policy for Windows 2008 and/or Windows 2008 R2 for a Conferencing Server

As stated previously, Lync Server 2010 can only be installed on Windows 2008 or Windows 2008 R2.  Both Windows 2008 and Windows 2008 R2 utilize Policy Based QOS which allows a wider variety of options for configuring QoS.

In the below example, we will show how to create the Policy-based QoS for Audio.  Once finished, be sure to also create Policy-based QoS policies for Video.  The DSCP Value for Audio will be 46 and the DSCP Value for Video will be 34. Open up Group Policy (in my examples, I am using Local Computer Policy but in a real production environment you would be using Group Policy at some level in your Domain Hierarchy) and navigate to Computer Configuration > Windows Settings > Policy-based QoS.  Right-click and choose Create new policy.

In the new Policy, give it a name and specify the DSCP Value.  The DSCP Value for audio is typically 46.  Make sure the Outbound Throttle Rate check box is cleared.  Click Next.

Because there are multiple applications that will stamp DSCP Values, we will choose All Applications. Click Next.

On the following screen, make sure you leave the defaults as “Any source IP address” and “Any destination IP Address.”  Click Next.

On the following screen, choose TCP and UDP.  In our information above we stated the default audio port range is 49152 to 57500 and does not need to be changed.  Because of this, our source port range will be 49152 to 57500, specified as 49152:57500.

Let’s go ahead and set the DSCP Value for Video with a DSCP value of 34. Right-click Policy-based QoS and choose Create new policy. In the new Policy, give it a name and specify the DSCP Value.  The DSCP Value for video is typically 34.  Make sure the Outbound Throttle Rate check box is cleared.  Click Next.

Because there are multiple applications that will stamp DSCP Values, we will choose All Applications. Click Next.

On the following screen, make sure you leave the defaults as “Any source IP address” and “Any destination IP Address.”  Click Next.

On the following screen, choose TCP and UDP.  In our information above we stated the default video port range is 57501 to 65535 and does not need to be changed.  Because of this, our source port range will be 57501 to 65535, specified as 57501:65535.

If you would like Client to Server QoS for Application Sharing, feel free to also create a new QoS Policy that provides DSCP Values for the port ranges specified for Application Sharing.  If you made this port range contiguous with Video, feel free to modify your Video QoS Policy to add the ports for Application Sharing if you are fine with also using a DSCP value of 34.
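When you later verify the markings in a network capture, some tools show the whole ToS (Type of Service) byte rather than the DSCP value.  The DSCP value occupies the upper six bits of that byte, so the conversion is a two-bit shift.  Here is a short Python sketch of that arithmetic (the function names are my own).  In DiffServ terms, DSCP 46 is EF (Expedited Forwarding) and DSCP 34 is AF41.

```python
def dscp_to_tos(dscp: int) -> int:
    """Return the ToS byte for a DSCP value, with the two ECN bits left at zero."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Recover the DSCP value from a ToS byte by dropping the two ECN bits."""
    return (tos & 0xFF) >> 2


for label, dscp in (("Audio", 46), ("Video", 34)):
    tos = dscp_to_tos(dscp)
    print(f"{label}: DSCP {dscp} -> ToS byte {tos} (0x{tos:02X})")
```

So a ToS byte of 0xB8 in a capture corresponds to the audio policy (DSCP 46), and 0x88 corresponds to the video policy (DSCP 34).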

Now go ahead and restart your Lync Conferencing Servers so they pick up the changes. After Group Policy has applied the settings, you should see the following settings within the registry:

Configuring Policy Based QOS in Group Policy for Windows 2008 and/or Windows 2008 R2 for a Dedicated Mediation Server

As stated previously, Lync Server 2010 can only be installed on Windows 2008 or Windows 2008 R2.  Both Windows 2008 and Windows 2008 R2 utilize Policy Based QOS which allows a wider variety of options for configuring QoS.

In the below example, we will show how to create the Policy-based QoS for Audio only.  The DSCP Value for Audio will be 46. Open up Group Policy (in my examples, I am using Local Computer Policy but in a real production environment you would be using Group Policy at some level in your Domain Hierarchy) and navigate to Computer Configuration > Windows Settings > Policy-based QoS.  Right-click and choose Create new policy.

In the new Policy, give it a name and specify the DSCP Value.  The DSCP Value for audio is typically 46.  Make sure the Outbound Throttle Rate check box is cleared.  Click Next.

Since this is Policy-based QoS, we will want to take advantage of only tagging traffic that the Mediation Server uses utilizing the executable MediationServerSvc.exe.  So make sure you choose the “Only applications with this executable name” and specify MediationServerSvc.exe. Click Next.

On the following screen, make sure you leave the defaults as “Any source IP address” and “Any destination IP Address.”  Click Next.

On the following screen, choose TCP and UDP.  In our information above we stated the default audio port range is 49152 to 57500 and does not need to be changed.  Because of this, our source port range will be 49152 to 57500, specified as 49152:57500.

Now go ahead and restart your Lync Mediation Servers so they pick up the changes. After Group Policy has applied the settings, you should see the following settings within the registry:

 

Configuring Policy Based QOS in Group Policy for Windows 2008 and/or Windows 2008 R2 for an Edge Server

As stated previously, Lync Server 2010 can only be installed on Windows 2008 or Windows 2008 R2.  Both Windows 2008 and Windows 2008 R2 utilize Policy Based QOS which allows a wider variety of options for configuring QoS.

In the example below, we will show how to create the Policy-based QoS for Audio.  Once finished, be sure to also create Policy-based QoS policies for Video.  The DSCP Value for Audio will be 46 and the DSCP Value for Video will be 34. Open up Group Policy (in my examples, I am using Local Computer Policy, but in a real production environment you would be using Group Policy at some level in your Domain Hierarchy) and navigate to Computer Configuration > Windows Settings > Policy-based QoS. Right-Click and choose Create new policy.

In the new Policy, give it a name and specify the DSCP Value.  The DSCP Value for audio is typically 46.  Make sure the Outbound Throttle Rate check box is cleared.  Click Next.

Since this is Policy-based QoS, we will want to take advantage of only tagging traffic that the Edge Server generates, by matching on its executable, MediaRelaySvc.exe.  So make sure you choose “Only applications with this executable name” and specify MediaRelaySvc.exe. Click Next.

Update (2/28/12) – I was informed that there is a bug and packets are not being stamped with DSCP if you specify MediaRelaySvc.exe. The documentation has you specifying MediaRelaySvc.exe, but I have been informed that specifying MediaRelaySvc.exe causes QoS on the Edge to not work.

On the following screen, make sure you leave the defaults as “Any source IP address” and “Any destination IP Address.”  Alternatively, you can change the Source IP Address to the internal IP of your Edge.  Click Next.

On the following screen, choose TCP and UDP.  In our information above we stated the default audio port range is 49152 to 57500 and does not need to be changed.  Because of this, our source port range will be 49152 to 57500, specified as 49152:57500.

I will not display the remainder of the QoS Policy configuration for the Edge as I’m sure by now, you are a master at configuring QoS Policies for Lync.  The remainder of the three QoS Policies will look as such:

Peer to Peer Video:

  • Policy Name: Lync Edge Peer to Peer Video
  • DSCP Value: 34
  • Only applications with the following executable name: MediaRelaySvc.exe
  • Specify Outbound Throttle Rate is Unchecked
  • Source IP: Your Internal Edge IP (Our example is 10.10.10.50/32)
  • Destination Port Range of 20040:20079

Client to Server Audio:

  • Policy Name: Lync Edge Conferencing Audio
  • DSCP Value: 46
  • Only applications with the following executable name: MediaRelaySvc.exe
  • Specify Outbound Throttle Rate is Unchecked
  • Source IP: Your Internal Edge IP (Our example is 10.10.10.50/32)
  • Destination Port Range of 49152:57500

Client to Server Video:

  • Policy Name: Lync Edge Conferencing Video
  • DSCP Value: 34
  • Only applications with the following executable name: MediaRelaySvc.exe
  • Specify Outbound Throttle Rate is Unchecked
  • Source IP: Your Internal Edge IP (Our example is 10.10.10.50/32)
  • Destination Port Range of 57501:65535
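Before creating the policies, it can be worth checking that no two port ranges with different DSCP values overlap, since an overlap would make the marking ambiguous. The following is a minimal Python sketch of that check, using the three policies listed above as data. The helper names (parse_range, find_conflicts) are my own for illustration and are not part of Lync or Windows.

```python
from itertools import combinations

# Policies copied from the list above: (name, DSCP value, destination port range).
POLICIES = [
    ("Lync Edge Peer to Peer Video", 34, "20040:20079"),
    ("Lync Edge Conferencing Audio", 46, "49152:57500"),
    ("Lync Edge Conferencing Video", 34, "57501:65535"),
]


def parse_range(text):
    """Split a 'first:last' string into two integers and validate them."""
    first, last = (int(part) for part in text.split(":"))
    if not 0 < first <= last <= 65535:
        raise ValueError(f"invalid port range: {text}")
    return first, last


def find_conflicts(policies):
    """Return pairs of policy names whose ranges overlap but whose DSCP values differ."""
    conflicts = []
    for (name_a, dscp_a, range_a), (name_b, dscp_b, range_b) in combinations(policies, 2):
        a_first, a_last = parse_range(range_a)
        b_first, b_last = parse_range(range_b)
        overlaps = a_first <= b_last and b_first <= a_last
        if overlaps and dscp_a != dscp_b:
            conflicts.append((name_a, name_b))
    return conflicts


conflicts = find_conflicts(POLICIES)
print(conflicts or "no conflicting port ranges")
```

If you change any of the port ranges in your own environment, update the POLICIES list to match before relying on the result.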

After all QoS Policies are created, reboot the Lync Edge Server.  You should see the following registry changes:


Enabling QoS for Lync Server 2010 – Part 1

There’s a doc available by Microsoft on how to enable Quality of Service (QoS) in Lync which you can find here.  The purpose of this multi-part article (first part for QoS on Lync Client and second part for QoS on Lync Server) is to lay everything out in a concise manner to help you, the reader, understand how to enable QoS.  Keep in mind that this article only covers enabling QoS; it is not a comprehensive guide to all the various dynamic ports available in Lync for locking down your firewalls.  For that, you can check out my other article here. The question may also arise: why and when would you want to enable QoS?  Audio and Video are synchronous traffic that can be affected by jitter, delay, and packet loss on an IP Network.  Lync has been designed to work without QoS, but Lync Administrators can choose to enable both Lync endpoints and servers to mark Differentiated Services Code Point (DSCP) values on audio and video packets.  This allows audio/video packets to be prioritized on a network that is enabled for Differentiated Services (DiffServ).

To better understand DiffServ and its effect on the network, please check out the excellent blog article written by fellow Lync MVP Jeff Schertz at the following URL: http://blog.schertz.name/2011/08/lync-qos-behavior/

So, let’s dive into my version of how to enable QoS.  Shall we?

Part 1

Part 2

Client QOS

Windows 7 versus Windows XP

Windows Vista and Windows 7 utilize Policy based QOS. Policy based QOS has the benefit that you can restrict DSCP marking to a specific application.  For Lync, this would be communicator.exe. Windows XP uses separate QOS Group Policy Options that do not allow you to restrict the DSCP values at the application level.  This means that all applications that use the ports we configure for Audio/Video will get DSCP markings stamped.

Peer to Peer Port Configuration

All client port ranges need to be changed as they all overlap by default.  By default, Client Media traffic utilizes ports 1024 to 65535 when doing Peer to Peer. To specify the client media port ranges, Set-CsConferencingConfiguration must be used. The port ranges for each modality must not conflict with another modality. Also, when each modality is locked down to its own port range, it is highly recommended that all of the ports be contiguous, as this will make configuring Group Policy a bit easier, as you will see later on in the article.

The command used to enable the ability to lock down peer to peer client ports is Set-CsConferencingConfiguration with the ClientMediaPortRangeEnabled set to 1.  When enabled, clients will use the specified port range for media traffic. When disabled (the default value) any available port (from port 1024 through port 65535) will be used to accommodate media traffic.  Because we want to lock down the peer to peer ports, we must run the following command:

Set-CsConferencingConfiguration -ClientMediaPortRangeEnabled 1

Once this command is run, we can go ahead and start locking down our ports.  Now keep in mind, all these settings are provided to the clients via in-band provisioning.  This means that once our clients sign in, they will start using these locked down port ranges, and no Group Policy Object needs to be created (at least not for locking down ports) and pushed down to your clients.

The following commands are where we finally choose the number of ports and the port at which each modality starts.  The commands are:

  • Application Sharing:
    Set-CSConferencingConfiguration -ClientAppSharingPort <beginning of port range (5350 by default)> -ClientAppSharingPortRange <extent of port range, at least 4 (40 by default)>
  • Audio:
    Set-CSConferencingConfiguration -ClientAudioPort <beginning of port range> -ClientAudioPortRange <extent of port range, at least 20 (40 by default)>
  • Video:
    Set-CSConferencingConfiguration -ClientVideoPort <beginning of port range> -ClientVideoPortRange <extent of port range, at least 20 (40 by default)>
  • File Transfer:
    Set-CSConferencingConfiguration -ClientFileTransferPort <beginning of port range> -ClientFileTransferPortRange <extent of port range, at least 20 (40 by default)>
  • Communicator 2007 R2:
    Set-CSConferencingConfiguration -ClientMediaPort <beginning of port range> -ClientMediaPortRange <extent of port range, at least 40>

Note: -ClientMediaPortRange is used for Office Communicator 2007 R2 Clients. The reason why this uses 40 is because this setting includes all modalities as Office Communicator 2007 R2 did not split apart each modality into their own separate switches.  Being able to break up each modality is a feature of Lync.

An example of a properly defined command with the minimum port requirement, all in a single command, is as follows:

Set-CsConferencingConfiguration -ClientAudioPort 20000 -ClientAudioPortRange 20 -ClientVideoPort 20020 -ClientVideoPortRange 20 -ClientAppSharingPort 20040 -ClientAppSharingPortRange 4 -ClientFileTransferPort 20044 -ClientFileTransferPortRange 4 -ClientMediaPort 20048 -ClientMediaPortRange 40

An example of a properly defined command with the default port range is as follows (this is the example we will use going forward when configuring Group Policy):

Set-CsConferencingConfiguration -ClientAudioPort 20000 -ClientAudioPortRange 40 -ClientVideoPort 20040 -ClientVideoPortRange 40 -ClientAppSharingPort 20080 -ClientAppSharingPortRange 40 -ClientFileTransferPort 20120 -ClientFileTransferPortRange 40 -ClientMediaPort 20160 -ClientMediaPortRange 40
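To make the contiguous layout easier to see, here is a minimal Python sketch that starts at port 20000, gives each modality 40 ports, and starts each range where the previous one ends. The parameter names come from the command above. The functions layout_ports and to_command are hypothetical helpers of my own and not part of Lync. The sketch only builds the command text and does not run anything against a Lync Server.

```python
# Hypothetical helpers: lay out contiguous client port ranges, one per modality,
# and build the matching Set-CsConferencingConfiguration command text.

def layout_ports(first_port, modalities):
    """Return {name: (start, count)} with each range starting where the previous one ends."""
    layout = {}
    next_port = first_port
    for name, count in modalities:
        if next_port + count - 1 > 65535:
            raise ValueError(f"{name} range would exceed port 65535")
        layout[name] = (next_port, count)
        next_port += count
    return layout


def to_command(layout):
    """Build the command line text for a given layout."""
    parts = ["Set-CsConferencingConfiguration"]
    for name, (start, count) in layout.items():
        parts.append(f"-Client{name}Port {start} -Client{name}PortRange {count}")
    return " ".join(parts)


# Modality names match the switch names used in this article; 40 ports each.
MODALITIES = [("Audio", 40), ("Video", 40), ("AppSharing", 40),
              ("FileTransfer", 40), ("Media", 40)]

layout = layout_ports(20000, MODALITIES)
print(to_command(layout))
```

Because each range begins immediately after the previous one, the ranges cannot overlap, which is the requirement stated earlier for the modality port ranges.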

Configuring Policy Based QOS in Group Policy for Windows Vista and/or Windows 7 clients

As stated previously, Windows Vista and Windows 7 clients utilize Policy Based QOS which allows a wider variety of options for configuring QoS.  For example, you can specify that only communicator.exe should tag x ports.

In the example below, we will show how to create the Policy-based QoS for Audio.  Once finished, be sure to also create Policy-based QoS policies for Video.  The DSCP Value for Audio will be 46 and the DSCP Value for Video will be 34. Open up Group Policy (in my examples, I am using Local Computer Policy, but in a real production environment you would be using Group Policy at some level in your Domain Hierarchy) and navigate to Computer Configuration > Windows Settings > Policy-based QoS. Right-Click and choose Create new policy.

In the new Policy, give it a name and specify the DSCP Value.  The DSCP Value for audio is typically 46.  Make sure the Outbound Throttle Rate check box is cleared.  Click Next.

Since this is Policy-based QoS, we will want to take advantage of only tagging traffic that communicator.exe uses.  So make sure you choose the “Only applications with this executable name” and specify communicator.exe. Click Next.

On the following screen, make sure you leave the defaults as “Any source IP address” and “Any destination IP Address.”  Click Next.

On the following screen, choose TCP and UDP.  In our example above we used the Set-CsConferencingConfiguration command with the -ClientAudioPort 20000 -ClientAudioPortRange 40 switches.  Because of this, our source port range will be 20000 to 20039, specified as 20000:20039, since our ClientAudioPortRange was 40 (20000 + 40 - 1 = 20039).

Let’s go ahead and create the policy for Video with a DSCP value of 34. Right-Click Policy-based QoS and choose Create new policy. In the new Policy, give it a name and specify the DSCP Value.  The DSCP Value for video is typically 34.  Make sure the Outbound Throttle Rate check box is cleared.  Click Next.

Since this is Policy-based QoS, we will want to take advantage of only tagging traffic that communicator.exe uses.  So make sure you choose the “Only applications with this executable name” and specify communicator.exe. Click Next.

On the following screen, make sure you leave the defaults as “Any source IP address” and “Any destination IP Address.”  Click Next.

On the following screen, choose TCP and UDP.  In our example above we used the Set-CsConferencingConfiguration command with the -ClientVideoPort 20040 -ClientVideoPortRange 40 switches.  Because of this, our source port range will be 20040 to 20079, specified as 20040:20079, since our ClientVideoPortRange was 40 (20040 + 40 - 1 = 20079).
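The port range string is easy to get wrong by one, because the range is inclusive: a starting port of 20000 with 40 ports ends at 20039, not 20040. As a minimal sketch, the Python helper below turns a starting port and a port count into the first:last form used in the policy. The function name policy_port_range is my own and purely illustrative.

```python
def policy_port_range(start, count):
    """Return the 'first:last' string for a range of `count` ports beginning at `start`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    last = start + count - 1  # inclusive range, so subtract one from start + count
    if not 0 < start <= last <= 65535:
        raise ValueError("port range must fall within 1 to 65535")
    return f"{start}:{last}"


# Values from the Set-CsConferencingConfiguration example used in this article.
audio_range = policy_port_range(20000, 40)
video_range = policy_port_range(20040, 40)
print(audio_range)
print(video_range)
```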

Now go ahead and restart your Lync clients so they pick up the changes. After Group Policy has applied the settings, you should see the following settings within the registry:

Also, if you are in Workgroup Mode and notice that DSCP Values are not being applied, you may have to apply the following registry key:

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\QoS]
"Do not use NLA"="1"

Configuring QOS Policies in Group Policy for Windows XP clients

As stated previously, Windows XP Clients (it’s the same for Windows Server 2003) cannot use policy-based QoS.  Instead, it uses QoS Policies based on the QoS Packet Scheduler.  To install the QoS Packet Scheduler on Windows XP or Windows Server 2003, please proceed with the following steps:

Go to Control Panel > Network Connections > Right-Click Network Interface > Choose Properties. Then Choose Install.

Make sure to choose Service.  Click Add.

Choose QoS Packet Scheduler as the Network Service.  Click OK.

Now it is time to go into Group Policy. The DSCP Value for Audio will be 46 and the DSCP Value for Video will be 34. Open up Group Policy (in my examples, I am using Local Computer Policy but in a real production environment you would be using Group Policy at some level in your Domain Hierarchy) and navigate to Computer Configuration > Administrative Templates  > Network > QoS Packet Scheduler.

The section we will be working with is “DSCP value of conforming packets.”  You do not need to modify “DSCP value of non-conforming packets.” The two options within “DSCP value of conforming packets” we will be working with are:

  • Controlled load service type (For Video with a DSCP Value of 34)
  • Guaranteed service type (For Audio with a DSCP Value of 46)

Let’s go ahead and set the DSCP Value for Video (Controlled load service type).  Go ahead and open “Controlled load service type.”  Choose Enabled and set the DSCP to 34. Then click OK.

Let’s go ahead and set the DSCP Value for Audio (Guaranteed service type).  Go ahead and open “Guaranteed service type.”  Choose Enabled and set the DSCP to 46. Then click OK.

After Group Policy has applied the settings, you should see the following two settings set within the registry:

Now hop on your Lync Server and open the Lync Management Shell and type the following command:

Set-CsMediaConfiguration -EnableQoS $true

This command should set your Windows XP and/or Windows Server 2003 machine with the following registry key:

Configuring QOS for Lync Phone Edition

Configuring Lync Phone Edition QoS is really simple and there’s really only one step.  By default, the DSCP Value is set to 40 which is not typical for voice DSCP. We can see the default value by running the following:

Get-CsUCPhoneConfiguration

Identity             : Global
CalendarPollInterval : 00:03:00
EnforcePhoneLock     : True
PhoneLockTimeout     : 00:10:00
MinPhonePinLength    : 6
SIPSecurityMode      : High
VoiceDiffServTag     : 40
Voice8021p           : 0
LoggingLevel         : Off

To set this value to 46, run the following command (leaving -Identity blank will modify the global settings):

Set-CsUCPhoneConfiguration -VoiceDiffServTag 46

Surprisingly, that’s all there is to it for enabling QoS to Lync Phone Edition.  That is of course other than rebooting your Lync Phone which is required.
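If you check a packet capture to confirm the phone is now marking with 46 instead of the default 40, keep in mind that some tools show the whole ToS (Traffic Class) byte rather than the DSCP value. The DSCP value sits in the upper six bits of that byte, so the byte is the DSCP value shifted left by two. The Python sketch below shows the conversion for the values used in this article, along with their standard DiffServ names. The function name tos_byte is my own, and the sketch assumes the two ECN bits are zero.

```python
# Standard DiffServ code point names for the DSCP values used in this article.
DSCP_NAMES = {34: "AF41", 40: "CS5", 46: "EF"}


def tos_byte(dscp):
    """Return the ToS / Traffic Class byte for a DSCP value, assuming ECN bits are zero."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2  # DSCP occupies the upper six bits of the byte


for value in (40, 46, 34):
    print(f"DSCP {value} ({DSCP_NAMES[value]}) -> ToS byte 0x{tos_byte(value):02X}")
```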

As an alternative to DSCP value, you can utilize 802.1p for Lync Phone edition.  This setting is effective only for networks in which switches and bridges are 802.1p-capable.  The minimum value for this property is 0 and the maximum is 7.  The default value is 0.

To enable 802.1p you can run the following command (leaving -Identity blank will modify the global settings):

Set-CsUCPhoneConfiguration -Voice8021p <value>
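For context on where this value ends up: the 802.1p priority is carried in the top three bits of the 16-bit Tag Control Information field of an 802.1Q VLAN tag, which is why it can only be 0 through 7 and only appears on VLAN-tagged frames. The Python sketch below packs that field. The priority of 5 and the VLAN ID of 100 are assumptions chosen purely for illustration and are not values from this article, and the function name vlan_tci is my own.

```python
def vlan_tci(priority, vlan_id, dei=0):
    """Pack an 802.1Q Tag Control Information field: 3-bit priority, 1-bit DEI, 12-bit VLAN ID."""
    if not 0 <= priority <= 7:
        raise ValueError("802.1p priority must be between 0 and 7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vlan_id <= 4095:
        raise ValueError("VLAN ID must be between 0 and 4095")
    return (priority << 13) | (dei << 12) | vlan_id


# Illustrative values only: priority 5 and VLAN 100 are assumptions.
tci = vlan_tci(priority=5, vlan_id=100)
print(f"TCI = 0x{tci:04X}")
```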

Conclusion

In this Part 1 on how to enable QOS for Lync Server 2010, we took a look at how to enable QOS for Lync clients.  In Part 2, we will take a look at how to enable QoS for Lync 2010 servers.
