I have a strange problem that seems to affect a fairly large number of endpoints. I have a client with about 150,000 endpoints, and a sizable share of them (3-5%) have stopped updating. I've been investigating a small subset of them manually to see what is going on.

Some of them do have corrupt definitions or can't communicate with the SEPM; I remediate those and they work fine from then on. But by far the largest group does not fall into this category. These endpoints are connected to a SEPM, and the logs show that they connect at least daily, if not more often (they're on a 2-hour heartbeat). When I check manually and run SymHelp, it doesn't show any corrupted definitions. Essentially, for some reason the endpoints just stop updating.

If I restart the SMC service (smc -stop / smc -start) or run Intelligent Updater, that fixes most of them. But why is this happening? Apart from not updating, everything is running fine on the endpoints. They check in, they update policy, and they will even run a command to update definitions, but nothing happens.
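To pick out the affected group in bulk rather than by hand, something like the following could work. It is only a rough sketch: it assumes the client list has already been exported into records with the fields `hostname`, `last_checkin`, and `definitions_date`, which are names I made up for illustration and not actual SEPM export columns, and the one-day and seven-day thresholds are arbitrary.

```python
from datetime import datetime, timedelta


def find_stalled(clients, now,
                 max_checkin_age=timedelta(days=1),
                 max_def_age=timedelta(days=7)):
    """Return hostnames that checked in recently but have stale definitions.

    `clients` is a list of dicts with hypothetical keys:
    "hostname", "last_checkin" (datetime), "definitions_date" (datetime).
    """
    stalled = []
    for client in clients:
        checked_in_recently = now - client["last_checkin"] <= max_checkin_age
        definitions_stale = now - client["definitions_date"] > max_def_age
        # Endpoints that are offline altogether are a different problem,
        # so only keep the ones that still talk to the manager.
        if checked_in_recently and definitions_stale:
            stalled.append(client["hostname"])
    return stalled


# Made-up sample records, purely to show the shape of the input.
now = datetime(2024, 1, 31, 12, 0)
sample = [
    {"hostname": "host-a", "last_checkin": now - timedelta(hours=2),
     "definitions_date": now - timedelta(days=20)},
    {"hostname": "host-b", "last_checkin": now - timedelta(hours=2),
     "definitions_date": now - timedelta(days=1)},
    {"hostname": "host-c", "last_checkin": now - timedelta(days=10),
     "definitions_date": now - timedelta(days=20)},
]
print(find_stalled(sample, now))
```

In this sample only `host-a` matches the pattern I'm describing: it is checking in on schedule, but its definitions are weeks old.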
Any ideas?