Large company (1000+ people)
And so I joined a large international company. The very first pleasant surprise was working SSO (single sign-on) implemented with Azure Active Directory. I set a password for my user account on my computer, and I never had to create or remember any other logins and passwords.
As for office suites, both Office365 and G Suite are used here for some reason. This came about historically, but at the moment it looks redundant. The only explanations I can think of are avoiding vendor lock-in or keeping a backup option in case one of the vendors gets blocked in some country.
As a messenger, at first there was Skype, then everyone switched to Teams. After Slack, I can't say I notice a big difference: there are some quirks, but you quickly get used to them. The obvious advantage is out-of-the-box integration with Office365. The downside that is downright infuriating: you can't mention a user if that person is not in the chat or channel. WTF???
For video calls, we used the hipster service BlueJeans, then switched to the ubiquitous Webex. The reason is most likely the load the service can handle.
For project management, engineering uses Jira, while designers and some of the product managers use Asana because of its simplicity and neatness. Jira copes well with a load of 3000+ users, although sometimes the cache on one of the nodes becomes invalid and you have to switch to another node manually. For roadmapping, they used to use Jira Portfolio, then switched to BigPicture. The admins say Portfolio was buggy, but it's hard for me to imagine anything more half-baked and strange than BigPicture. Its only plus is that everything lives inside Jira as a plugin. In that situation it would be more logical to choose Portfolio, since it is also an Atlassian product. And of course, many product teams still plan everything in Excel first.
ServiceNow was chosen as the helpdesk portal and the software for managing IT tasks. The UX of this service got stuck somewhere in the early noughties, but the tool still does its main job. But again: why not choose the Atlassian stack, since you already bought Jira? And if we are talking about the convenience of service desks in general, the ideal solution is some kind of chatbot or a single search bar where I type what I need in plain language, with no mile-long forms to fill out and no request classification to learn.
And a little more about IT. I have to install Carbon Black, Umbrella, ClearPass and GlobalProtect on my computer, and there is probably something else in there too. Together they consume about 30% of the computer's resources on average. But this is probably justified if the company is large enough and doesn't skimp on expensive hardware.
Previously, before going on a business trip, I would go to the office manager, we would sit down at the computer and start choosing flights and hotels on Expedia. This is convenient, of course, but the approach does not scale at all, and there were cases when someone forgot the details and the convenient flights were no longer available, or visas were obtained at the very last minute. It turned out that there are special tools for travel management too, for example ComBtas. It reminds you when your visa expires, an agent picks the tickets for you, and you can submit all your expenses and get reimbursed. But almost no one manages to fill everything out correctly the first time without hints, and it is unclear how to book a trip for a whole group at once.
The company keeps all its applications and services in several data centers, with data replication configured between them. The Amazon cloud used to be actively used, but it was abandoned for security reasons. As a result, we can install any software and experiment, but we lose speed whenever we wait for a new piece of hardware, or rather, always two pieces of hardware. Akamai DSA is used to switch quickly between data centers, as well as to speed up the delivery of resources. For smart management of external traffic (proxying, load balancing, SSL), the data center uses F5 (hardware + software).
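To make the load balancing and data center switching idea a bit more concrete, here is a minimal sketch of round-robin balancing that skips unhealthy backends. This is not how F5 or Akamai DSA work internally; the `Backend` class, the `healthy` flag and the names "dc1" and "dc2" are all made up for illustration.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator, List


@dataclass
class Backend:
    name: str             # hypothetical identifier, e.g. "dc1"
    healthy: bool = True  # in real life this would be set by a health check


def healthy_round_robin(backends: List[Backend]) -> Iterator[Backend]:
    """Yield backends in rotation, skipping the ones marked unhealthy."""
    if not backends:
        raise ValueError("no backends configured")
    ring = cycle(backends)
    while True:
        # Look at each backend at most once per request.
        for _ in range(len(backends)):
            candidate = next(ring)
            if candidate.healthy:
                yield candidate
                break
        else:
            # A full pass found nothing healthy.
            raise RuntimeError("all backends are down")
```

With two healthy backends, requests alternate between them; once one is marked unhealthy, all traffic goes to the other, which is the simplest form of switching between data centers.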
Whereas the cloud offers a bunch of pre-installed monitoring tools out of the box, in your own data center you have to install and configure everything yourself. Just in case, here is a list of useful tools that help collect and visualize metrics, predict anomalies and respond quickly to alerts:
ELK for collecting logs and building charts from them.
Grafana for advanced visualization of information from any data source; it works well with complex queries.
New Relic or Atatus for APM and profiling of PHP applications. Atatus is cheaper and has fewer features, but for the sake of a contract with a large company its team is ready to quickly implement all sorts of client wishes.
AppDynamics as an APM for Java applications and databases.
Prometheus for collecting custom metrics, alerting, service discovery.
Anodot for advanced anomaly and incident recognition, replacing manual monitoring with ML.
The InfluxDB time-series database and the highly versatile MemSQL database for storing client crashes, errors, events and load times.
Lenses.io to manage Apache Kafka.
Zipkin for tracing requests.
Zabbix for collecting metrics on hardware.
Icinga for monitoring network resources.
The marketing descriptions on the websites of these tools claim that each of them provides full monitoring of the system in one place, but in practice you have to use a whole zoo, choosing based on cost, convenience and killer features.
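As an illustration of what "predicting anomalies" means at its simplest, here is a sketch of a rolling z-score check over a series of load times. It is a toy and not the algorithm used by Anodot or any other tool from the list; the function name, the window size and the threshold are my own assumptions.

```python
from statistics import mean, stdev
from typing import List, Sequence


def find_anomalies(values: Sequence[float], window: int = 5,
                   threshold: float = 3.0) -> List[int]:
    """Return indexes of points that deviate from the mean of the
    preceding `window` points by more than `threshold` standard deviations."""
    if window < 2:
        raise ValueError("window must be at least 2")
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Perfectly flat history: any change at all is suspicious.
            if values[i] != mu:
                anomalies.append(i)
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical page load times in milliseconds, with one obvious spike.
load_times = [100, 102, 98, 101, 99, 100, 500, 101]
print(find_anomalies(load_times))  # prints [6]
```

Real tools add seasonality, trend handling and alert routing on top of this, which is exactly why a handmade check like this one does not replace them.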
From monitoring, let's move on to analytics tools. The Office365 package includes Power BI, which is quite simple and intuitive and is used by all the "non-professionals". The cool data analysts, of course, use the more powerful and sophisticated Tableau. All analytics events are stored in Vertica, a column-oriented database tailored for big data and all sorts of complex joins in queries. And for marketing analytics, deep links, redirects and push notifications, AppsFlyer is used.