Demystifying the Active Directory FSMO roles

3 04 2010

If you’ve spent any time administering Active Directory, you’ve probably come across the concept of Flexible Single Master Operations (FSMO) roles. Their introduction is arguably one of the most important but misunderstood changes to Active Directory in the last ten years.

Take a trip down memory lane

In the days of Windows NT, one may recall the Primary Domain Controller (PDC) and Backup Domain Controller (BDC) concept. The directory was structured such that every DC, whether a PDC or a BDC, had a copy of the directory database, but only the PDC could make changes to that database. The model was inefficient, negatively impacted growth and desperately needed improving if the product had any chance of surviving.

Enter Windows 2000. The Directory Service went through one of its largest scale rebuilds to date. Replication and management was significantly improved and the concept of having a multi-master directory was introduced. Although this design has been tweaked over the years, fundamentally, it has remained the same through the versions – because it works. Any DC anywhere in the domain can execute virtually any update to the directory. This scales beautifully, even on large, geographically dispersed networks with many thousands of users.

However, notice I said virtually any change. Since a change can take effect at any DC, there is the possibility that a conflicting change will be made in two locations concurrently – or before replication can occur. Active Directory must ensure these situations are accounted for. In most cases, it applies its complex Multimaster Conflict Resolution Policy, which essentially says the last change wins. However, there are several procedures which simply cannot conflict; these procedures are assigned to one of the five FSMO roles, which go on to be delegated to one or more Domain Controllers.

What are the FSMO roles?

There are nominally five roles present in the directory which reside on DCs nominated specifically by the Administrator to perform these tasks. All the roles are very important and constitute a single point of failure in all Active Directory enterprises. If you have a complex topology with more than one domain, some roles are domain-specific, so you can expect to have duplicates of some roles in every domain in the enterprise.

  • The Domain Naming Master exists once per forest – in the forest root domain – and is rarely used. It is responsible for processing the addition of new child domains, application partitions and external cross-references to the enterprise. Since the name of a child domain or application partition cannot be duplicated (it would conflict in DNS, let alone send Active Directory around the twist), the DC holding this role is the only DC with the ability to process all additions of this kind in the forest.
  • Infrastructure Master: If a user from a foreign domain within the same forest is added as a member of a compatible group in another domain, the DCs in the group’s domain must have some information about that user in its local database in order to update the member attribute of the group. To do this, it adds a special record to its database called a phantom, which contains only the foreign user’s security identifier (SID), globally unique identifier (GUID) and their distinguished name (DN). Like all objects in the database, this record is given a distinguished name tag, or DNT, an internal reference used solely in the low-level Active Directory database layer. In doing this, the directory service is able to add that user as a member of the group by referring to the phantom’s DNT, just like it would refer to a user’s own DNT if you added a user from the group’s own domain to the group. You might think of this like using a primary key in a relational database to refer to objects across tables, but not exactly, as Active Directory’s database is by no means any sort of RDBMS.

    That’s very clever, but what if something about the source user in their original domain changes? If the user is renamed, moved or deleted, the phantom in the group domain DC databases would lose its referential integrity with the source domain. This is a situation the infrastructure master aims to avoid. On a periodic basis (by default, every 2 days), the infrastructure master – an FSMO role present in every domain – compares its local database to a Global Catalog (GC) server to determine whether any changes have been made to the objects the phantoms were created to represent. A GC contains a partial replica of all objects in the forest, so replication means any GC would already know about this updated data. The phantom is then updated with new values or deleted from the domain’s database if the object has been removed from its source domain.

    In a multi-domain forest, you must either locate this role on a Domain Controller which is not a Global Catalog or, if you must locate the role on a GC, ensure all DCs in that particular domain are GCs. A GC will never create phantoms because it already knows about users from other domains. If the infrastructure master is a GC, there will never be any phantoms in its local database to compare with the global catalog data, so no updates will be made, but other non-GC DCs in the domain would gradually become outdated. If all DCs in the domain are GCs, or you only have a single-domain forest, every DC knows enough about the security principal that it does not need to create a phantom, so this role is essentially redundant.

  • Schema Master: As the name suggests, this role is the Master of the Schema, the information which contains the formal definitions of how Active Directory stores objects, what attributes are available on those objects and so on. This role exists once per forest, on a DC in the forest root domain. Any updates to the Schema must be tightly controlled, so one DC delegated as the Schema Master performs all such changes to the database. Schema updates are then replicated to other DCs on the network by standard Active Directory replication.

So far, three of the five roles have been covered. Those above are those I would consider the least critical FSMO roles in the forest. If you lose the DC delegated one or more of these roles, it’s no big deal — it may prevent a network administrator taking an action, but it will not impact the usability of the network. Losing the Domain Naming Master or Schema Master would create problems in regard to creating child domains or running schema updates, but these generally occur very rarely and checking this Operations master DC is up would be part of the planned engineering works. Similarly, losing the Infrastructure Master may cause integrity issues in the database, but given that it only runs its scan every two days in the first place, a day or two of outage will not generally cause an issue.

  • RID Master: This role is one of the two which are important to the daily operation of Active Directory. Under the glossy GUI of Windows, security principals are identified and differentiated by use of two values – a Security Identifier (SID) and a Globally Unique Identifier (GUID).A SID is an alphanumeric string which is unique throughout a forest. The SID is the actual value used internally by Windows to identify users and grant access to resources using Discretionary Access Control Lists (DACLs), for example, via the ‘Security’ tab on a file or directory. Have you ever deleted a user, recreated her, then wondered why she cannot access the same files and folders, despite having the same username? The new account would have a new SID and is therefore considered an entirely different security principal to the system.Contrary to popular belief, the username, distinguished name or full name of a user are not internal tracking mechanisms within Windows as all these values could change.The standard make up of an SID might be as follows (this SID is purely random): S-1-5-21-789336058-1123561945-725345543-10823.The nature and formation of an SID is beyond the scope of this article, but it is the very last octet (in this instance, 10823) we are interested in. This figure represents a Relative Identifier (RID), an incremental value which actually makes the SIDs unique within a domain, ensuring no two users conflict in the database. When a security principal (user, computer, group etc.) is created, the domain SID (in this instance S-1-5-21-789336058-1123561945-725345543) has the next available RID appended to the end.Each Domain Controller is initially allocated a pool of 500 RIDs. As security principals are created, RIDs are used up. The allocation of RIDs to DCs is a task delegated in the RID Master FSMO role to one DC in a domain. Placing the operation in an FSMO role ensures no DC obtains a duplicate RID pool, which would eventually lead to conflicts in SID values and a major problem in terms of SID-uniqueness within the domain.
  • PDC Emulator is the most complicated and least understood role, for it runs a diverse range of critical tasks. It is a domain-specific role, so exists in the forest root domain and every child domain. Its original conception was for backwards compatibility with legacy systems, such as Windows NT BDCs. However, the role is also responsible for keeping the domain time in sync, given that the DC holding this role in the forest root domain is the most authoritative time source in the forest. Password changes and account lockouts are immediately processed at the PDC Emulator for a domain, to ensure such changes do not prevent a user logging on as a result of multi-master replication delays, such as across Active Directory sites.It should be noted that the PDC Emulator does not act in the same fashion as a PDC on a Windows NT network. Cast your eye back to the top of this article and note the section regarding a multi-master directory — for multi-master aware applications, most updates can be made at any DC on the network. However, if an application (or Operating System) is not multi-master aware, the PDC Emulator acts as if it were the PDC on the Windows NT network. One of these older applications would most probably single out the PDC Emulator and write all its changes there.

The latter two roles are much more crucial to the daily operation of the network and could very quickly become a limiting factor in its growth, usability or even the logon process if the DC(s) holding the roles are offline for any period of time. If the RID Master is lost, impact will only be felt by the Network Administrator if a DC depletes its pool of RIDs. On busy networks, this could potentially occur in a matter of days through the creation of new security principals. However, loss of the PDC Emulator could directly affect your users — you’d better have a substantial help desk ready for a spike in call volume if this DC is down for an extended period of time. For example, with the most authoritative source of time unavailable, time skew could eventually occur between DCs and computers in the enterprise and/or domain, lending itself to Kerberos authentication errors and ultimately, failed logons. While it would not be an immediate issue to take this server offline (provided you do not have any legacy applications), this would be the role I would be most concerned about in the event of a DC failure.

Conclusion

If you are still reading, well done! This article covers several aspects of Active Directory in detail, including low-level database processes unseen at the surface – particularly via the GUI. However, FSMO roles are a crucial component of your deployment — having an understanding of the underpinning concepts will help with their placement, deployment and high availability concerns within your enterprise.

Advertisements




Configuring Windows Time for Active Directory

1 08 2009

I’ve had a few requests recently from people who were confused regarding how to configure time in their Active Directory domains – and some were playing with settings on servers and workstations to try to make things work. In this article, I’ll briefly explain how the time service works in Active Directory networks and general information on how you should go about configuring it.

For anyone not aware, all machines in an Active Directory environment automatically find a time server to sync time with. Workstations use their authenticating Domain Controller, and the DCs sync with the server holding the PDC Emulator FSMO role. In a multi-domain forest, the PDC Emulator in each child domain synchronises with a DC or the PDCe in the forest root domain. To ensure the time remains reliable across the forest, only the PDC Emulator in the forest root domain should ever sync with an external time source – this leads to only one source of time being used across the forest. The Windows Time Service blog have a great post entitled Keeping the domain on time which explains this in more detail, including a great graphic.

The Windows Time Settings

You can find the settings for the Time Service in the registry, under HKLM\SYSTEM\CurrentControlSet\Services\W32Time\Parameters. The most important value to note is the ‘Type’ string – on any domain machine other than the PDC Emulator in the forest root, this should be set to NT5DS. That name isn’t particularly descriptive; if it is set, it means the machine is finding a time server in the Active Directory hierarchy.

If it isn’t set to that, you should think about resetting the time service on that machine. To do that, run a Command Prompt as an Administrator and execute the following commands:

net stop w32time
w32tm /unregister
w32tm /register
net start w32time

Check the registry again, and the Type should now be in domain sync mode (NT5DS).

Sometimes, you may find an NTPServer key in the registry despite the Type being set to NT5DS. NT5DS doesn’t use an NTP Server, so what gives? This setting is simply left over from prior to the machine being joined to the domain, when it was in a workgroup. Provided the Type value is set correctly, the NTPServer entry can be completely ignored or even deleted. Running the above commands on a domain-joined machine will delete it automatically.

The Group Policy Settings

There are also a number of Group Policy settings for the time service. These can be found in Computer Configuration\Administrative Templates\System\Windows Time Service.

I do not encourage you to change these settings; if you have done so, you probably want to revert the policies to ‘Not Configured’. There are reasons why you may make the odd change, but in general, no changes are required and you can actually break the time sync if you do make them.

If you are interested in reading further about what they do, the Windows Time Service blog has another great page going through them: Group Policy Settings Explained.

The Forest Root PDC Emulator Settings

After a bit of a configuration reset, all your DCs, member servers and workstations should now be set to sync from the domain hierarchy. But what about the PDC Emulator in the forest root?

The fact of the matter is the PDCe doesn’t actually need to synchronise with anything. It automatically designates itself the most reliable time server in the domain and it can run quite happily like that, without ever talking to an external time server. My earlier blog post entitled Time: Reliable or accurate? describes why.

However, to have an easy life and keep your users from complaining, it is almost always a good idea to have some form of external time sync on the forest root PDC Emulator. There are a number of ways to do this – for example, an external hardware clock which syncs with GPS. However, the most common (and cheapest – free) solution is synchronising with another NTP server on the Internet. I often use the servers closest to me which participate in possibly the largest time service, the NTP Project (list of time servers). Be aware that if you are bound by SLAs (my company certainly is), by its very nature, the NTP project most probably isn’t the resource for you.

To configure the time sync on the PDCe, you need to execute the following commands. I’d strongly suggest you get a level playing field by resetting the time service using the instructions above before you start.

w32tm /config /manualpeerlist:”uk.pool.ntp.org,0x8 europe.pool.ntp.org,0x8″ /syncfromflags:MANUAL /reliable:yes /update

What’s that command doing?

That command is a rather hefty command, so you may like to know exactly what it is doing to your server. All the changes are taking place in the registry at the key I posted above; using the w32tm tool to make the configuration changes is simply much easier than doing it manually yourself.

/config causes the tool to enter configuration mode. There are a number of other modes it supports which you can find by running w32tm /?.

/manualpeerlist allows you to specify the NTP server or servers you wish to synchronise time with. In this instance, each server’s DNS name or IP address should have a comma followed by the string 0x8. This instructs Windows to send requests to this external server in client mode. If you enter multiple servers, which I suggest, put the servers in quotation marks and separate each entry with a space. The value you specify here is written back to the NTPServer value in the time service’s registry key.

/syncfromflags tells the time service where it should sync time from. You can specify two entries for this – either DOMHIER or MANUAL. The former causes the time service to synchronise with the Domain Hierarchy (sets NT5DS in the Type key in the registry) whereas the latter tells the time service to sync with the server(s) you specified in the Manual Peer List. MANUAL sets Type to NTP.

/reliable sets the server to be a reliable source of time for the domain. Strictly it isn’t required, because the PDC Emulator in the forest root is always the most reliable time server, but I like to include it anyway.

Finally, /update notifies the time service the values have changed, so the new settings are used with immediate effect. If this isn’t included, the registry is updated but the new values will only be used by the time service when its service or the server itself is restarted.

After you’ve run that command, you might want to take a look in the registry to see what changes have been made, and whether they are as you expected.

Check Time Synchronisation

You may be intrigued to know whether the time sync is working correctly. You can do this in one of two ways.

The safest is to wait for a scheduled time sync to take place, or restart the machine. Either will trigger Event ID 35 to be logged in the System log. This event’s description shows the time server the machine is synchronising with. This will be logged on both the PDC Emulator and all DCs, member servers and workstations. You can check for this on member machines to ensure a DC in the domain hierarchy is being found and used correctly – and to ensure your custom NTP servers configured on the PDC Emulator are being used as intended.

Alternatively, putting your cowboy hat on, you can force a time synchronisation. Set the time a minute or two out from what it should be, then return to the command prompt and run w32tm /resync /rediscover. After a few moments, the above event should be logged, and a healthy time service should cause the time on the system to be set back to normal.

As a note, no time synchronisation will take place if the difference between the current system time and the new time provided by the time server is too great. A minute or two is fine, but I would not set the difference to be any more than that. The system checks this difference at each sync, and will reject the new time provided by the time server if it is too large.

Conclusion

You should now have an understanding of how the time service works and where it stores its settings in the registry. While time isn’t one of the most fun services an Active Directory administrator will work with, it is important you ensure the forest stays in sync if you want to avoid major problems with time skew, Kerberos and Active Directory in general.