Telecom:OSS

Categories in Operations Support Systems



Network Management Systems - Performance and Capacity Monitoring

The Performance Management discipline is concerned with measuring and reporting metrics affecting the behavior of applications relying on data communications. Of particular importance are:
  • The round-trip and propagation delays
  • The delay variation (also known as jitter) experienced by packets traversing the network
  • The number of errors of all sorts resulting in packets being dropped
  • The transmission speeds on network links
  • The response times of applications like web portals, database systems or IP Telephony servers
  • The number of application-level errors, e.g. the number of failed calls in IP telephony
  • The delays or backlogs in application processes like database replication
  • The amount of time processes are delayed due to intensive I/O activity or resource shortage
  • The distribution of traffic belonging to different protocols or applications

There are different methods of measuring performance in networks:
  • Passive measurement, either by sniffing the network traffic or by relying on performance metrics computed by network devices or applications
  • Active measurement, relying on active probes generating synthetic traffic and measuring delay and losses
  • Active measurement, relying on software probes embedded in applications (e.g. script snippets in web pages), measuring application-level response times.
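As a rough illustration of what an active probe reports, the sketch below summarizes one run of synthetic probes into loss, average round-trip time and jitter. The function name `summarize_probes` and the sample values are assumptions made for this example, and jitter is taken here as the mean absolute difference between consecutive round-trip times, which is only one of several definitions in use.

```python
from statistics import mean

def summarize_probes(rtts_ms):
    """Summarize one probe run; a value of None marks a lost probe."""
    if not rtts_ms:
        raise ValueError("at least one probe result is required")
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(received)) / len(rtts_ms)
    # Jitter (assumed definition): mean absolute difference between
    # consecutive received round-trip times.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    return {
        "loss_pct": loss_pct,
        "avg_rtt_ms": mean(received) if received else None,
        "jitter_ms": mean(diffs) if diffs else 0.0,
    }

# Hypothetical round-trip times in milliseconds; the third probe was lost.
sample = [20.0, 22.0, None, 21.0, 25.0]
print(summarize_probes(sample))
```

Tools such as Smokeping collect this kind of data continuously; the sketch only shows the arithmetic applied to a single batch of samples.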
Keeping performance parameters such as delay, jitter and packet drop within acceptable limits is essential for performance-sensitive applications like IP Telephony and Video Conferencing.
Performance parameters in a network are subject to near real-time monitoring, and alerts should be generated when measured metrics fall outside acceptable limits and intervention is required. Periodic reporting and trend analysis are usually required in order to assess the health of a network or Information Technology infrastructure.
The Capacity Management discipline is concerned with detecting actual or potential bottlenecks in a network or Information Technology infrastructure, e.g. over- or under-utilization of network links or a shortage of computing resources. The goal is to detect capacity utilization trends, optimize the utilization of existing resources (e.g. switch ports) and plan upgrades and evergreening of the infrastructure in order to sustain utilization growth. Periodic reporting and trend analysis are required in order to assess and address capacity issues.
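The trend analysis mentioned above can be sketched with an ordinary least-squares line fitted to periodic utilization samples, then extrapolated to a planning threshold. The function names, the weekly sampling interval and the 80% threshold are assumptions for illustration; real capacity planning tools typically use more robust models (percentiles, seasonality).

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def periods_until(threshold, xs, ys):
    """Periods after the last sample until the trend reaches the threshold.

    Returns None when utilization is flat or falling.
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    return max(0.0, (threshold - intercept) / slope - xs[-1])

# Hypothetical weekly link utilization in percent.
weeks = [0, 1, 2, 3, 4]
utilization = [50.0, 52.0, 54.0, 56.0, 58.0]
print(periods_until(80.0, weeks, utilization))  # 11.0
```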
Specialized Network Management tools are used for Performance and Capacity Management:
  • Passive monitoring tools - Torrus, Cricket, Cacti, Percival, OpenNMS, TStat, Infovista, eHealth, SMARTS PM
  • Active probing tools - Smokeping, IPerf, MPing
  • NetFlow, SFlow or IPFIX collection and analysis tools - nTop/nProbe, Flowscan, Flowmon, Flowtools, IPFlow, AdventNet ManageEngine
  • Embedded active probing tools - Cisco IPSLA


Network Management Systems - Availability and Fault Management

These are network monitoring tools that measure the availability and performance of applications, services, network equipment, servers, and other IT infrastructure components. Generally these tools provide web-based real-time views of monitored metrics, notify when critical conditions are met and keep a history of status changes. The availability of network elements and applications is measured through different means:
  • Active polling (e.g. through SNMP or ICMP for network elements, through HTTP for web portals or TNS for Oracle DB)
  • Passive listening for alerts from devices (e.g. SNMP traps)

Failure scenarios in networks are usually cascading, e.g. a core switch going down will result in a large number of unavailability alerts for devices that are no longer accessible. In order to reduce the number of incident-related tickets, Availability and Fault Management tools perform correlation and root-cause analysis, based on the topological models they generate during the network discovery phase. The correlation can be rule-based (e.g. Tivoli TEC, Netcool), model-based (e.g. Spectrum) or probabilistic (e.g. SMARTS). There are many Availability and Fault Management tools, the best known being:
  • HP OpenView Network Node Manager
  • EMC Smarts suite
  • IBM Tivoli NetView
  • Nagios (Open Source)
  • GroundWork (Open Source)
  • ZenOSS (Open Source)
Some tool suites (e.g. ZenOSS, EMC SMARTS) provide both Performance/Capacity Management and Availability/Fault Management.
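The topology-based correlation described above can be illustrated with a deliberately simplified rule: a device that is down is reported as a root cause only if its upstream neighbour is still up. The sketch assumes a tree-shaped topology held in a hypothetical `upstream` map; it does not reproduce how any of the listed products implement correlation.

```python
def root_causes(upstream, down):
    """Return the down devices whose upstream neighbour is not itself down."""
    roots = set()
    for device in down:
        parent = upstream.get(device)
        if parent is None or parent not in down:
            roots.add(device)
    return roots

# Hypothetical discovered topology: device -> upstream device (None for the root).
upstream = {
    "core1": None,
    "dist1": "core1",
    "dist2": "core1",
    "acc1": "dist1",
    "acc2": "dist1",
    "acc3": "dist2",
}

# dist1 fails, so acc1 and acc2 behind it also appear unreachable.
down = {"dist1", "acc1", "acc2"}
print(root_causes(upstream, down))  # {'dist1'}
```

With this rule, three unavailability alerts collapse into a single ticket for `dist1`. Redundant paths, which are common in real networks, would need a graph-reachability check rather than a single-parent lookup.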

Network Management Systems - Network Asset, Inventory and Network Change and Configuration Management (NCCM)

The Network Asset and Inventory Management practice is concerned with discovering and classifying the devices, resources and applications that exist in a network environment. This "live" inventory is often necessary for reconciliation with the administrative inventory (often kept in a CMDB), for validating Install, Move, Add and Change (IMAC) operations and for billing purposes (e.g. inventory of switch ports or of IP phones).
The Network Change and Configuration Management practice is concerned with changing and auditing the configurations present in devices that exist in a network environment. This is often necessary for compliance with security policies and with operations management frameworks like ITIL or ISO/IEC 20000.
There are many tools able to retrieve and audit configurations from network elements, and to discover and keep an inventory of network components with their attributes. Some NCCM tools can also automatically push configuration and firmware updates to devices.
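A configuration audit usually comes down to comparing the configuration retrieved from a device against an approved baseline. The sketch below does this with Python's standard `difflib` module; the function name `config_drift` and the configuration lines are invented for the example and do not come from any particular NCCM product.

```python
import difflib

def config_drift(baseline, running):
    """Return the lines removed from ('-') or added to ('+') the baseline."""
    diff = list(difflib.unified_diff(
        baseline.splitlines(), running.splitlines(),
        fromfile="baseline", tofile="running", lineterm=""))
    # Skip the two file-header lines, then keep only changed lines
    # (hunk headers start with '@@', context lines with a space).
    return [line for line in diff[2:] if line.startswith(("+", "-"))]

# Hypothetical device configurations.
baseline = "hostname edge1\nsnmp-server community s3cret RO\nno ip http server"
running = "hostname edge1\nsnmp-server community public RO\nno ip http server"

for change in config_drift(baseline, running):
    print(change)
```

An empty result means the running configuration matches the baseline; any other result can be logged as a compliance finding.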

Network Management Systems - Assessment and testing

Network assessments aim to measure network- and application-level performance under heavy-load conditions. Open source tools like OpenSTA or Iperf are used in assessment and performance testing in order to create synthetic network load. OpenSTA is a distributed software testing architecture for performing scripted HTTP and HTTPS heavy-load tests, while Iperf is a network-stress tool for measuring maximum TCP bandwidth and tuning various network parameters; it reports bandwidth, delay jitter and datagram loss.
In the commercial tools category, Ixia's Chariot (formerly a NetIQ product) allows network engineers to simulate full-scale enterprise networks with a mixture of traffic types, including VoIP, multicast, Oracle and SAP business transactions. It also enables pre-deployment testing of networks with real-world application loads. The Proxy Sniffer load testing tool measures the response time and stability of web applications such as e-banking, portals or web shops under heavy-load conditions by simulating thousands of concurrent web users. NetIQ's Vivinet Assessor tool can simulate heavy network traffic while measuring QoS metrics, in order to determine the ability of a network to carry VoIP traffic.

The Simple Network Management Protocol (SNMP) and Management Information Base (MIB)

The most widespread protocol used for Network Management is the Simple Network Management Protocol (SNMP). Both SNMP v1 and SNMP v2c use the same clear-text, sniffable ASN.1/BER encoding and an authentication scheme based on community strings. SNMP v3 mainly brings security improvements over SNMP v2c, with a framework for authentication, privacy and access control, as well as the ability to dynamically configure the SNMP agent itself using SNMP SET commands against the MIB objects that represent the agent's configuration. The SNMP v3 "User-based Security Model" (USM) is defined in RFC 3414. USM specifies the use of a pair of shared secret keys (one for authentication and the other for encryption), a digest-based (MD5 or SHA-1) authentication mechanism and a 56-bit DES-CBC encryption mechanism.
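To see why community strings are considered sniffable, it helps to look at how BER encodes them: an OCTET STRING is just a tag byte (0x04), a length byte and the raw characters. The sketch below encodes a community string this way; the helper name is an assumption, and only the short length form (values up to 127 bytes) is handled.

```python
def ber_octet_string(value: bytes) -> bytes:
    """BER-encode an OCTET STRING using the short length form only."""
    if len(value) > 127:
        raise ValueError("this sketch only handles the short length form")
    return bytes([0x04, len(value)]) + value

encoded = ber_octet_string(b"public")
print(encoded.hex())          # 04067075626c6963
print(b"public" in encoded)   # True: the community string travels as plain bytes
```

Anyone capturing SNMP v1 or v2c packets can therefore read the community string directly, which is the gap the SNMP v3 authentication and privacy mechanisms address.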
The SNMP framework assumes that every "managed" system exposes the functionality of an SNMP "agent" that can perform local management functions and report on local information when commanded to by a remote management system. The information elements to be managed or reported on are conceptually organized in a tree-like structure, named the Management Information Base (MIB), of which 2 categories exist:
  • Vendor-specific MIB definitions, under iso(1).org(3).dod(6).internet(1).private(4).enterprises(1)
  • Standard MIB definitions, which all vendors must comply with, under iso(1).org(3).dod(6).internet(1).mgmt(2).mib-2(1)
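The two subtrees named above correspond to the numeric prefixes 1.3.6.1.4.1 (enterprises) and 1.3.6.1.2.1 (mib-2), so a numeric object identifier can be classified by a simple prefix comparison. The helper names and the sample OIDs below are chosen for illustration only.

```python
MIB2_PREFIX = (1, 3, 6, 1, 2, 1)         # iso.org.dod.internet.mgmt.mib-2
ENTERPRISES_PREFIX = (1, 3, 6, 1, 4, 1)  # iso.org.dod.internet.private.enterprises

def parse_oid(text):
    """Turn a dotted OID string such as '1.3.6.1.2.1' into a tuple of ints."""
    return tuple(int(part) for part in text.strip(".").split("."))

def classify_oid(text):
    """Classify an OID as 'standard', 'vendor-specific' or 'other'."""
    oid = parse_oid(text)
    if oid[:len(MIB2_PREFIX)] == MIB2_PREFIX:
        return "standard"
    if oid[:len(ENTERPRISES_PREFIX)] == ENTERPRISES_PREFIX:
        return "vendor-specific"
    return "other"

print(classify_oid("1.3.6.1.2.1.1.3.0"))   # standard (sysUpTime.0)
print(classify_oid("1.3.6.1.4.1.9.2.1"))   # vendor-specific (enterprise number 9)
```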

The SNMP Agent Extensibility (AgentX) framework permits dynamically registering managed objects without having to interrupt the service or restart the SNMP agent.
Remote Network Monitoring (RMON) is not a separate management protocol but an SNMP-based management framework built on distributed, SNMP-managed RMON probes (or devices, e.g. Catalyst switches) that sniff network traffic, accumulate statistics and allow their retrieval through the RMON2 MIB.

The DMTF Common Information Model (DMTF-CIM)

The Common Information Model (CIM) is a DMTF-defined standard representation of a manageable IT infrastructure as an object-oriented, UML-based hierarchy of classes representing categories of physical or logical IT infrastructure elements. It consists of a schema, which formally defines the actual model in 3 formats - MOF, UML and XML. The CIM schema itself is extensible and consists of a small set of base classes (the "core" model), the "common" model that extends the "core" model to represent technology-agnostic management areas, and the "extension" classes that extend a "common" model for a technology-specific area.

The Web-Based Enterprise Management (WBEM) framework

The Web-Based Enterprise Management (WBEM) industry initiative aims at unifying the management of distributed computing environments through compliance with a set of common standards based on web technology. WBEM is a system management framework based on CIM servers and clients communicating through an XML-encoded request/response protocol.
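To give a feel for the XML-encoded requests exchanged between a WBEM client and a CIM server, the sketch below builds an EnumerateInstances request body with the standard library. The element and attribute names follow the DMTF CIM-XML representation as best understood here and should be checked against the specification; the namespace `root/cimv2`, the message ID and the function name are assumptions, and the HTTP transport is left out.

```python
import xml.etree.ElementTree as ET

def enumerate_instances_request(class_name, namespace="root/cimv2", message_id="1001"):
    """Build a CIM-XML EnumerateInstances request body as a string."""
    cim = ET.Element("CIM", CIMVERSION="2.0", DTDVERSION="2.0")
    message = ET.SubElement(cim, "MESSAGE", ID=message_id, PROTOCOLVERSION="1.0")
    request = ET.SubElement(message, "SIMPLEREQ")
    call = ET.SubElement(request, "IMETHODCALL", NAME="EnumerateInstances")
    # The namespace path is expressed as one NAMESPACE element per component.
    path = ET.SubElement(call, "LOCALNAMESPACEPATH")
    for part in namespace.split("/"):
        ET.SubElement(path, "NAMESPACE", NAME=part)
    param = ET.SubElement(call, "IPARAMVALUE", NAME="ClassName")
    ET.SubElement(param, "CLASSNAME", NAME=class_name)
    return ET.tostring(cim, encoding="unicode")

print(enumerate_instances_request("CIM_ComputerSystem"))
```

A WBEM client library would normally generate and send such messages; the point of the sketch is only to show that the request names a CIM class and a namespace in plain XML.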
Vendors like HP, BMC, Cisco, IBM, CA, Sun and Microsoft support WBEM or flavors thereof.
Some commercial and Open Source WBEM software frameworks exist, like the mature Pegasus, the WBEM Services Java framework or the Microsoft Systems Management Server (SMS). The CIM Query Language allows a WBEM client to request that a WBEM server return a set of management information entries matching complex search criteria.
The WS-Management specification defines a CIM-compliant management framework based on Web Services and the SOAP protocol.
Windows Management Instrumentation (WMI) is Microsoft's flagship system management approach. It is claimed to be WBEM and CIM compliant in terms of architecture, schema and protocol, and features more than 100 WMI providers in Windows Vista.

The Telecommunications Management Network (TMN) framework

The Telecommunications Management Network (TMN) is the open-systems network management model defined by ITU-T. It defines a hierarchy of management layers interconnected through interface points (e.g. the Q3 interface), through which Operations Systems, Network Elements and Mediation Devices exchange information across layers using the Common Management Information Protocol (CMIP).
The Common Management Information Services Element (CMISE) defines the service primitives used to invoke services on the Managed Objects at the TMN Q3 interface point. The primitives are mapped on top of the Remote Operations Service Element (ROSE) application-layer protocol, defined in ITU-T standards X.219 and X.229.
The Common Management Information Protocol (CMIP) is part of the ITU-T OSI management model defined in the X.700 series standards and is a client-server protocol in which the management system acts as the manager and the managed equipment hosts the agent.
On the agent side there is a MIB (Management Information Base) containing a logical representation of the managed real resources as Managed Objects, which are specified using the ASN.1 syntax according to the Guidelines for Definition of Managed Objects (GDMO).
The TMN is maintained by the TeleManagement Forum (TM Forum), a consortium of over 500 companies. Major telecom equipment suppliers (e.g. Nokia, Alcatel, Ericsson, Siemens) feature compliant CMISE/CMIP TMN implementations, and major suppliers of Network Management software (Harris, HP INvent, IBM/Micromuse) support TMN with their products.

Network Management solutions for Wireless Environments

The Wireless Infrastructure Provider market is still quite fragmented, although uncontested market leaders like Cisco, Aruba Networks, Symbol or Meru are starting to emerge. Comparing the Gartner "Magic Quadrant for Wireless Infrastructure" for 2006 and 2005, one can see that the market is consolidating around a few leaders. Consequently, leading Wireless Network Management solutions tend to provide good support for equipment from multiple vendors.
The wireless LAN market surpassed $3.6 billion in 2006, growing 15 percent over sales in 2005, with Cisco leading the pack, growing at a steady yearly 35 percent, followed by Symbol, while the fastest-growing vendor for the segment was Aruba, whose sales grew 62 percent to the No. 3 spot.
It is expected that by 2010 over 80% of North American organizations will have adopted wireless LANs, with many opting to deploy mobile VoIP solutions, taking advantage of the increasing number of PDAs and mobile phones incorporating WiFi technology. However, as the technology spreads, its challenges, mainly on the security and service-level side, need to be addressed by specialized Network Management solutions for Wireless Environments.
Besides the need for Availability, Fault and Performance Management, managing a Wireless LAN infrastructure involves much more emphasis on security and roaming than the management of the "wired" network environment. Of paramount importance are Network Change and Configuration Management (NCCM), Intrusion Detection and Prevention (IDS/IPS) and the automatic monitoring and adjustment of signal coverage, interference patterns and roaming performance.
From a security standpoint it is important that the WLAN Management System monitors the network in real time for the presence of unauthorized ("rogue") devices, either APs or clients, alerts when it finds any and, where possible, blocks or disables them.
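At its simplest, rogue detection is a comparison between the access points seen on the air and an inventory of authorized ones. The sketch below assumes scan results are available as dictionaries with `bssid` and `ssid` keys; these field names, the addresses and the function name are hypothetical, and real systems also correlate wired-side data, since a BSSID alone can be spoofed.

```python
def find_rogue_aps(observed, authorized_bssids):
    """Return the observed access points whose BSSID is not in the inventory."""
    allowed = {bssid.lower() for bssid in authorized_bssids}
    return [ap for ap in observed if ap["bssid"].lower() not in allowed]

# Hypothetical inventory and scan results.
authorized = ["00:11:22:33:44:55", "00:11:22:33:44:66"]
scan = [
    {"bssid": "00:11:22:33:44:55", "ssid": "corp"},
    {"bssid": "DE:AD:BE:EF:00:01", "ssid": "corp"},   # same SSID, unknown radio
    {"bssid": "00:11:22:33:44:66", "ssid": "guest"},
]

for ap in find_rogue_aps(scan, authorized):
    print("ALERT rogue AP:", ap["bssid"], ap["ssid"])
```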
Specialized commercial Intrusion Detection/Prevention solutions exist from AirMagnet, AirTight Networks, AirDefense and AdventNet.
There are also Open Source and free wireless security solutions, most of them having evolved from hacking tools - NetStumbler, FakeAP, WifiScanner, Kismet, AirSnare, wIDS and others. An "all in one" solution is the AirWave Management Platform (AMP).

Management Agent Simulation solutions

Simulating the behavior and functionality of real-device management agents, mainly for CIM-XML, SNMP and TL1, is extremely useful for validating Network Management solutions before releasing them, be it by Software Houses or Information Technology Service Providers. There are different categories of such simulators:

New Generation Operations Systems and Software (NGOSS)

The term "New Generation Operations Systems and Software" (NGOSS) encompasses a new business model supported by a set of architectures, technologies, methodologies and business processes meant to reduce costs, drive revenues and enable telecom companies to competitively provide and manage the new, converged network-based services.
NGOSS is being touted as an Enterprise Application Integration (EAI) framework for Operations Support Systems (OSS) with the same economic benefits for service providers as EAI delivered for enterprises.
The adoption of NGOSS by major Telecom service providers is advancing at a very slow pace, due to perceived risks and the high investment required.
The NGOSS initiative is driven by the TeleManagement Forum (TMF or TM Forum), which is an industry alliance of major Telecom Service Providers, Network Infrastructure Suppliers and OSS Application Vendors.
The main architectural components of NGOSS are:
The OSS through Java (OSS/J) initiative aims at standardizing the interfaces of NGOSS-compliant software components and is already being used by a number of NGOSS implementations.