
Checksum Mitigates Network Risks

Checksums are used to make ethernet communications robust. I’ll attempt to explain how that works and why leveraging checksums in a network qualification is efficient and effective.

Practically every packet of information passed in an ethernet network is carefully checked for accuracy and, if the test fails, the system automatically requests that the packet be resent. Ethernet devices like the network interface cards (NICs) in computers follow strict guidelines and communication protocols before passing data from the network on to the computer.

As an ethernet device prepares a packet of data to be sent over the ethernet, it uses an algorithm to create a checksum from that data. The checksum cannot be used to re-create the data, but it can be used to verify that the data received is identical to the data that formed the packet.
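Ethernet frames carry this check value as a 32-bit CRC (the frame check sequence, computed in hardware). As a sketch of the idea, Python’s zlib.crc32 computes the same style of check value in software; the payload bytes here are illustrative:

```python
import zlib

def frame_check(data: bytes) -> int:
    # Compute a CRC-32 check value over the payload, the same
    # kind of test an ethernet NIC performs in hardware.
    return zlib.crc32(data)

payload = b"packet payload bytes"
checksum = frame_check(payload)

# The checksum cannot re-create the data, but any change to the
# data changes the checksum, so a mismatch flags corruption.
assert frame_check(payload) == checksum
assert frame_check(b"packet payload byteZ") != checksum
```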
You might ask, “How can the checksum be part of the very data packet it is a checksum of?” The data is like the contents of a sack lunch, and the receipt is the checksum. The receiving network card is designed to separate the datagram from the header and checksum portions of the packet and perform the test: just like I have no problem separating the paper from my lunch.
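That separation can be sketched in Python, with the checksum appended as a 4-byte trailer. The layout and names here are illustrative, not the real ethernet frame format:

```python
import struct
import zlib

def pack(data: bytes) -> bytes:
    # Staple the "receipt" to the sack: append the CRC-32 of the
    # payload as a 4-byte big-endian trailer.
    return data + struct.pack("!I", zlib.crc32(data))

def unpack(packet: bytes) -> bytes:
    # Separate the payload from the trailer, then test the payload
    # alone -- the paper comes off before lunch is eaten.
    data, trailer = packet[:-4], packet[-4:]
    (crc,) = struct.unpack("!I", trailer)
    if zlib.crc32(data) != crc:
        raise ValueError("checksum mismatch: request a resend")
    return data

assert unpack(pack(b"hamburger")) == b"hamburger"
```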

Ethernet networks, like the internet built on them, were designed to provide this kind of inherent testing so the infrastructure would not need to be disassembled for testing. The components with the greatest risk of failure are not the wires and fibers but the interface cards and the devices they connect.

Inherent testing assures that tests include the network translation devices, including the connection interface where the “rubber meets the road,” so to speak. It is critical that the tests include these connections “in situ” since even a tiny piece of lint can scatter the light in fiber-optic connections or increase resistance in copper connections.

The checksum tests are embedded in the underlying protocols that define how data are sent over the network. They cannot be turned off by end users or adjusted to allow a percentage of error: any anomaly requires a resend of the entire packet. Persistent failures result in a complete loss of communication, not delivery of inaccurate information.
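That all-or-nothing behavior can be sketched as a receive loop. MAX_ATTEMPTS and the channel and request_resend callables are illustrative stand-ins for the real protocol machinery:

```python
import zlib

MAX_ATTEMPTS = 5  # illustrative limit before giving up entirely

def receive(channel, request_resend):
    # Accept a packet only if its CRC matches; otherwise ask for a
    # resend. Persistent failure ends the conversation rather than
    # delivering inaccurate data.
    for _ in range(MAX_ATTEMPTS):
        data, crc = channel()  # returns (payload, advertised CRC)
        if zlib.crc32(data) == crc:
            return data
        request_resend()
    raise ConnectionError("persistent checksum failures")

# A channel that corrupts the first two deliveries, then succeeds.
deliveries = []
def flaky_channel():
    deliveries.append(1)
    data = b"payl0ad" if len(deliveries) < 3 else b"payload"
    return data, zlib.crc32(b"payload")

assert receive(flaky_channel, lambda: None) == b"payload"
assert len(deliveries) == 3
```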
Let’s say that someone offers to drive to town and get a lunch for me. If my goal is to mitigate the risk that they deliver an inaccurate lunch, I can approach the problem from two angles:
I could check their car over and carefully survey the road surface for the route I expect they will travel.
Or, knowing that they might take either the Interstate or the county road, I could just wait until they return and check the receipt (i.e., the checksum) against the contents of my lunch sack. If one BYTE is missing from my hamburger, the checksum test will fail and I’ll send them back for another entire packet, er, I mean, sack lunch.

With the checksum approach there is an opportunity for a slow lunch delivery – especially if I keep sending them back until they no longer eat part of my lunch – but the risk I wanted to mitigate was the opportunity for an inaccurate lunch. Testing the wire and fiber components of the infrastructure doesn’t do anything to assure the integrity of the packets delivered: if packets don’t precisely match their checksums, they will never be delivered.

So, how can we quantitatively determine the state of our network infrastructure from the NIC of the server to the NIC of the client PC, including any other ethernet data acquisition devices, without unplugging anything? We could add up the number of resend requests that are sent when data packets fail the checksum test. Switches and hubs have been designed to track success parameters like these for many years. Investments made to report this data would surely provide a return by empowering the organization to track and trend the true health of the network.
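On a Linux host, for example, these cumulative counters are already exposed per interface under /sys/class/net. A sketch of collecting them, assuming a Linux system; the interface name eth0 and the exact set of counters available vary from system to system:

```python
from pathlib import Path

def nic_error_counters(iface="eth0", sysfs="/sys/class/net"):
    # Read cumulative per-interface counters from Linux sysfs.
    # rx_crc_errors counts frames that failed the hardware
    # checksum test and were dropped rather than delivered.
    stats = Path(sysfs) / iface / "statistics"
    names = ("rx_packets", "rx_crc_errors", "rx_errors", "tx_errors")
    return {name: int((stats / name).read_text()) for name in names}

# Trending these numbers over time gives the quantitative view of
# network health described above, e.g.:
# print(nic_error_counters("eth0"))
```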

Reports and summaries of packet failures are nice, but not necessary to leverage the power of the checksum in network qualification documents. All that is needed there is a savvy explanation of how all your network devices checksum essentially every ethernet* packet for quality and automatically request a resend on error.

Hopefully this blog entry will assist you in that task.** You are welcome to use any part of it without reference, but please leave me a comment if it has been helpful or entertaining. You are welcome to reference the work in its permanent archive:

* Note that the checksum cited here does not apply to non-ethernet protocols like RS-232 and RS-485 serial communications. Implementations of serial and other connections require some other means of qualifying accuracy.

** The author is not responsible or liable for misuse or interpretation of the information presented here: USE AT YOUR OWN RISK!

Keep IT Simple

Working in the IT dept. of a medium-sized business, we are often asked to “integrate” a couple of applications. The requesters may or may not have done the research necessary to specify the integration they need. They may already have both applications and can specify the fields from one product that they simply want replicated to the other, but upon further questioning, they may still want either application to be used to enter these records and, of course, both applications should synchronize automatically. These “random integrations” can consume large amounts of time and may never fully satisfy the requester before one application needs an upgrade.

Unless the IT dept. has lots of time to spare, the only hope for success is to keep the project as simple as possible. Find out the ultimate goal of the requested integration. If management requires reports that include information from two distinct data sources, it may be possible to write a report that organizes information from both sources. That is a lot less work than augmenting records from one source into the other application. It’s also less likely to require extensive attention when either application is upgraded: it is simply a combined report. (Labor-saving integrations will be discussed another day.)

For instance, let’s say you have one program that tracks employees’ time toward projects, and another accounting program that tracks income and expenses other than labor on those projects. The obvious report required would answer the question, “Which projects are making/losing money?” It may be a lot easier to combine data from both sources into a single report than to incorporate records from one database into another to produce the same report.
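A combined report of that kind can be sketched in a few lines; every figure here, including the hourly_rate, is illustrative:

```python
hourly_rate = 50.0  # illustrative fully-loaded labor rate

# Export from the time-tracking program: hours charged per project.
hours = {"bridge": 120.0, "tunnel": 80.0}

# Export from the accounting program: income minus non-labor
# expenses per project.
ledger = {"bridge": 9000.0, "tunnel": 2500.0}

def project_margins(hours, rate, ledger):
    # Combine both sources in the report itself, leaving both
    # databases untouched: margin = net ledger - labor cost.
    return {p: ledger.get(p, 0.0) - h * rate for p, h in hours.items()}

report = project_margins(hours, hourly_rate, ledger)
assert report == {"bridge": 3000.0, "tunnel": -1500.0}
```

Here the “bridge” project is making money and the “tunnel” project is losing it, answered directly from the two exports without replicating a single record.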

Leaving the original databases alone may also provide more current data. If users enter their time every day, a combined report would be updated daily. But if the accounting system gets an infusion of that time only at the end of a payroll period, then management would only see current data around payday.

Old data is of little use: even management cannot change the past. The more critical timing is, the less we want to rely on direct replication of records from one database to another. A combined report is also much simpler: what will happen when a record that has already been replicated gets changed?

So there you have a case for keeping it simple.