I ran into a very annoying error in SharePoint 2013 where the Newsfeeds wouldn’t work because of issues with the Distributed Cache. After I set the ULS logs to verbose, the error message became more informative: the AppFabric service could not connect to net.tcp://machinename:22233. I did a lot of research on the subject, but nothing I tried helped. The Distributed Cache wouldn’t start, and AppFabric kept reporting a service status of “down”. I found a number of fixes for this issue, but none of them applied to me. After a bit more research it turned out that the Distributed Cache:
- Must run under the farm account, because it needs access to the farm configuration database. You can change the account, but it’s not worth the trouble of figuring out what it needs to access and where.
- Needs its service account (the farm account) to be in the local Administrators group during provisioning, similar to the User Profile Service. After the Distributed Cache service has started, you can remove the local admin rights again.
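To confirm the symptom and the two prerequisites above on an affected server, something like the following could be run from an elevated PowerShell prompt. This is only a rough sketch: it assumes the default cache port 22233 from the error message and that the AppFabric Windows service is registered under the name AppFabricCachingService.

```powershell
# Is anything listening on the cache port from the error message?
# (22233 is assumed to be the port in use on this host)
$client = New-Object System.Net.Sockets.TcpClient
try {
    $client.Connect($env:COMPUTERNAME, 22233)
    "Cache port 22233 is reachable"
} catch {
    "Cache port 22233 is NOT reachable: $($_.Exception.Message)"
} finally {
    $client.Close()
}

# Which account does the AppFabric service run under, and is it started?
# (the service name is an assumption; adjust it if yours differs)
Get-WmiObject Win32_Service -Filter "Name='AppFabricCachingService'" |
    Select-Object Name, StartName, State

# List the local Administrators group to see whether the farm account is in it
net localgroup Administrators
```

If StartName is not the farm account, or the farm account is missing from the group listing, that matches the situation described above.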
That is all good, but I now had a Distributed Cache service stuck in a limbo state. How can we rectify that? Let’s start by removing the host from the cache cluster entirely. To do that, use the SharePoint PowerShell cmdlet Remove-SPDistributedCacheServiceInstance. After it executes, you will notice that “Distributed Cache” no longer appears in the “Services on Server” section of Central Admin. At this point, go ahead and add the farm account to the local Administrators group. When that is done, add the service back with the cmdlet Add-SPDistributedCacheServiceInstance. As soon as that completes, check whether the service is running in the “Services on Server” section in Central Admin, and if it is not, start it manually. If you inspect the ULS logs, you will notice that the service is now doing a lot of provisioning. Give it a few minutes and it should be ready, with your newsfeeds running again.
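Put together, the steps above might look roughly like this when run on the affected cache host from an elevated SharePoint 2013 Management Shell. Treat it as a sketch rather than a script to paste blindly: CONTOSO\sp_farm is a placeholder for your own farm account, and note that both SharePoint cmdlets carry the SP prefix.

```powershell
# Load the SharePoint snap-in if this is a plain PowerShell console
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# 1. Remove this host from the cache cluster
Remove-SPDistributedCacheServiceInstance

# 2. Add the farm account to the local Administrators group
#    (CONTOSO\sp_farm is a hypothetical account name)
net localgroup Administrators "CONTOSO\sp_farm" /add

# 3. Add the host back to the cache cluster
Add-SPDistributedCacheServiceInstance

# 4. Check how SharePoint sees the service instance
Get-SPServiceInstance |
    Where-Object { $_.TypeName -eq "Distributed Cache" } |
    Select-Object Server, Status

# 5. Check how AppFabric sees the cache host (should no longer report "down")
Use-CacheCluster
Get-CacheHost
```

Once the service has been up for a while, the farm account can be taken out of the group again with the same net localgroup command and /delete instead of /add.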
As a best practice, the Distributed Cache:
- Should not run on the same box as Excel Services, Search Service, Project Server or SQL Server.
- Must have around 40% of the total server memory allocated (configurable via PowerShell).
- Must have an identical memory allocation on all Distributed Cache hosts.
- Needs Remote Registry to be accessible through the firewall between the Distributed Cache hosts.
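As an illustration of the memory points above, the following sketch works out 40% of a server’s physical memory and shows where that number would go. The 40% figure is simply the one from the list above, and the port 22233 is assumed to be the default. The resize itself is left commented out because, as far as I know, the Distributed Cache service should be stopped on all cache hosts before resizing; check the official documentation for the exact procedure.

```powershell
# Total physical memory of this server in MB
$totalMB = [math]::Floor((Get-WmiObject Win32_ComputerSystem).TotalPhysicalMemory / 1MB)

# Roughly 40% of it, the figure used in the list above
$cacheMB = [int][math]::Floor($totalMB * 0.4)
"Total memory: $totalMB MB, proposed cache size: $cacheMB MB"

# Show the current cache host configuration, including its size
# (run from the SharePoint 2013 Management Shell on a cache host)
Use-CacheCluster
Get-AFCacheHostConfiguration -ComputerName $env:COMPUTERNAME -CachePort 22233

# Apply the new size. Commented out on purpose: stop the Distributed Cache
# service on every cache host first, then run this once.
# Update-SPDistributedCacheSize -CacheSizeInMB $cacheMB
```

Since the allocation should be identical on every host, it is worth running the Get-AFCacheHostConfiguration check against each cache host and comparing the sizes.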
You can read more on the Distributed Cache Service here: Plan for the Distributed Cache service