
June 10, 2020

Tale: ETL component

On this occasion we will look at a situation that unexpectedly produced a great design, partly on purpose and partly by accident.

Problem
We had a customer with a legacy system that stored all its data in a local data source. It is a very old system, written in Fortran, that handles huge amounts of data and is rarely updated.

As you can expect, this is not a simple system to integrate with external participants, and that was exactly what they needed. A partner had a well-built website and wanted to integrate it with the legacy system's data, meaning that data could be fetched from that system and updated as well.

Proposed solution - draft 1

Our team was asked to provide a solution for this scenario, and this is where the figure of an Architect comes in handy: we were a young team, with very few years of experience and almost no idea of what to do next.

The Architect took the lead and designed an ETL (Extract/Transform/Load) component that would act as a mediator between the systems: it would read data from the legacy system and synchronize it to the partner web system, and in turn take the data from the partner web system and synchronize it back to the legacy system.

The behavior was designed this way because the legacy system couldn't be disrupted: a single misbehavior could take it down. Synchronous communication was out of the picture; everything needed to be asynchronous to ensure a stable integration.
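As a rough sketch of the idea (every class, method and value below is invented for illustration; the real ETL's code and technology are not described here), a periodic two-way synchronization job could be wired like this:

    import java.util.List;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Consumer;
    import java.util.function.Function;
    import java.util.function.Supplier;

    // Illustrative sketch only: a periodic, asynchronous two-way synchronization job.
    public class EtlSynchronizerSketch {

        // One direction of the sync: extract changes, transform them, load them into the other side.
        record SyncDirection<S, T>(Supplier<List<S>> extract,
                                   Function<S, T> transform,
                                   Consumer<T> load) {
            void run() {
                extract.get().stream().map(transform).forEach(load);
            }
        }

        public static void main(String[] args) {
            // Hypothetical wiring for the two directions (legacy -> partner, partner -> legacy).
            SyncDirection<String, String> legacyToPartner = new SyncDirection<>(
                    () -> List.of("legacy-record-1"), String::toUpperCase,
                    row -> System.out.println("loaded into partner: " + row));
            SyncDirection<String, String> partnerToLegacy = new SyncDirection<>(
                    () -> List.of("PARTNER-RECORD-9"), String::toLowerCase,
                    row -> System.out.println("loaded into legacy: " + row));

            // Asynchronous and scheduled, never called from the legacy system's request path;
            // the production runs described here happened on a two-hour cadence.
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                legacyToPartner.run();
                partnerToLegacy.run();
            }, 0, 2, TimeUnit.HOURS);
        }
    }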

Solution - state 1
Our team worked well and delivered the first ETL to production. It had some small issues, but the synchronization process between the Legacy system and the External partner ran every two hours and worked really well.

Problems in legacy

But after a few weeks in production the customer, the owner of the Legacy system, saw a problem: the amount of data changed by the External partner was huge compared to their normal change rate, and the Legacy system was getting slower because of this integration and data synchronization.

Their Legacy system had a great feature for this scenario: they created a Replica environment that served read-only operations (which amounted to 90% of the total), while the original system remained the place where data could be written. The new scheme solved the problem on their side, but the ETL needed to be updated: it now had to look at two different data sources with different behaviors.

Our team's Architect took the lead again. He said: "the ETL remains unchanged, we will fix this on the wire". His decision was to put queues in front of the ETL to handle the data traffic and to install agents in the Legacy system and the External partner system; these agents would monitor the data and communicate with the queues through structured messages, either asking for data (read) or performing a data modification (write).
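As a rough sketch of that messaging idea (the real queue technology and message schema are not described here, so an in-process BlockingQueue and invented fields stand in for them):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // Illustrative sketch only: agents publish structured read/write messages to a queue,
    // and the ETL consumes them at its own pace, keeping the legacy system's load controlled.
    public class AgentQueueSketch {

        enum Operation { READ, WRITE }

        // A structured message an agent would place on the queue for the ETL to consume.
        record DataMessage(Operation operation, String entity, String payload) { }

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<DataMessage> queue = new LinkedBlockingQueue<>();

            // Agent side: publish a read request and a write request (invented example data).
            queue.put(new DataMessage(Operation.READ, "customer", "id=42"));
            queue.put(new DataMessage(Operation.WRITE, "order", "{\"id\":7,\"status\":\"shipped\"}"));

            // ETL side: drain the queue asynchronously, decoupled from both systems.
            while (!queue.isEmpty()) {
                DataMessage message = queue.take();
                System.out.println("processing " + message.operation() + " on " + message.entity());
            }
        }
    }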

The changes were done in two weeks. Performance remained the same: the ETL kept running without changing a line of code, the Legacy system's load was kept under control, and the External system worked really well and fast.

Final setup

Retrospective analysis

Thinking back, the project went really well: the development process was smooth, integration was simple, and a lot of things that usually go wrong were avoided. Many of these benefits came from key decisions made by the Architect from the beginning:

  • Functional programming as the main development paradigm
  • Automated tests covering critical cases
  • Periodic automated builds
  • Asynchronous communication
  • A guaranteed communication channel

July 18, 2016

Personal current trends on software development



  • Most productive development environment (server-side):
  • Most productive web development environment (single-page applications):
  • Most productive mobile development environment:
  • Most interesting programming model:
  • Most useful getting-things-done tool:

June 18, 2013

It's not about the tool

Over and over I keep hearing and reading certain types of questions in software development forums:
  • What is the best tool to do ?
  • What is the perfect IDE for ?
  • What can I use for modelling in ?
I know this because I asked those same questions when I started a few years ago, but after dealing with different things during my short experience I have learned that the answers were never going to be enough, because the questions were never the correct ones in the first place.



Let me explain. The questions I (and some other developer folks) was asking during those initial years could be translated to other contexts this way:
  • What is the best hammer to build a chair?
  • What is the best brand for office furniture?
  • What pencil can I use to draw amazing designs?
If you read them carefully, you can see how lame they are. I should have focused first on the concepts that make those tools useful. The tools are just the medium to build something; the major part of the problem is to identify and understand what to build in the first place.

Of course it is very useful (and mandatory) to know about the tools, but the effort should be directed towards understanding how things work and how they should work. That's the reason why the Business Analyst title is so hot at the moment: we as developers have pushed the system's business into the background. So that's my advice: focus on the concepts, the business and the things that have value; the tools will come eventually.

September 16, 2012

Domain driven semantics

If you are a programmer, you have dealt with persistence layers in many senses:

And if you are a programmer, you know that all those persistence layers have to be abstracted from the application's business logic. There are several techniques to achieve this:
All those techniques decouple your data from the specific persistence source you are using, because your application should have its own 'language'.

Why?

Because each application is meant to solve a specific problem (or a set of problems), that problem has a domain, and that domain will be used as the basis during the development and maintenance phases. During these phases you might have different people working with the same code, so the developers will need to understand what they are modifying in order to add/remove/fix specific parts of the system.

This is the point where documentation and comments become important. I also take a close look at the context applied to the code itself, I mean class names, attribute identifiers, method names, etc.

All these things give the source code meaning. What can you tell from this?
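Something along these lines, written straight against JPA (a hedged reconstruction with invented class, field and query names, not the original snippet):

    import java.math.BigDecimal;
    import java.util.List;
    import javax.persistence.Entity;
    import javax.persistence.EntityManager;
    import javax.persistence.Id;
    import javax.persistence.Persistence;

    // Hypothetical reconstruction of the first style: the persistence machinery (entity
    // manager, JPQL, named parameters) leaks straight into the calling code.
    public class CustomerReport {

        public List<Customer> loadCustomers() {
            EntityManager em = Persistence
                    .createEntityManagerFactory("app")   // persistence unit name is invented
                    .createEntityManager();
            return em.createQuery(
                            "SELECT c FROM Customer c "
                          + "WHERE c.active = true AND c.balance > :minBalance "
                          + "ORDER BY c.lastName",
                            Customer.class)
                     .setParameter("minBalance", new BigDecimal("1000"))
                     .getResultList();
        }
    }

    // A minimal, invented entity just to make the snippet self-contained.
    @Entity
    class Customer {
        @Id Long id;
        String lastName;
        boolean active;
        BigDecimal balance;
    }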


and what about this?
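Perhaps something along these lines, with the persistence details hidden behind a domain-named method (again an invented reconstruction, not the original snippet):

    import java.math.BigDecimal;
    import java.util.List;

    // Hypothetical reconstruction of the second style: the caller only sees a method named
    // after the business concept; the persistence details live behind the repository.
    public class PremiumCustomerReport {

        private final CustomerRepository customers;   // invented abstraction over the persistence layer

        public PremiumCustomerReport(CustomerRepository customers) {
            this.customers = customers;
        }

        public List<Customer> loadCustomers() {
            return customers.activeCustomersWithBalanceAbove(new BigDecimal("1000"));
        }
    }

    // The interface carries the domain's vocabulary; a JPA implementation (or any other)
    // can sit behind it and change without touching the callers. Customer is the same
    // invented entity from the previous snippet.
    interface CustomerRepository {
        List<Customer> activeCustomersWithBalanceAbove(BigDecimal minimumBalance);
    }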


Which one is more verbose? Which one has more meaning? Which one would you prefer to have in your source code?

Even though the first chunk of code uses JPA, it is still quite verbose. The second one really does its job: you don't need to know whether you are using a database or not, and it is really easy to read. If you want to modify a condition or change the behavior, you just go to the specific method, and that's it!

March 2, 2012

Dead code evilness

Working on several projects at the same time is really exciting: you have to manage your time across tasks and work with different people, hence the diversity of technologies and environments.
But there is something all those projects have in common: dead code. Dead code can be classified into a few categories:
  • Legacy code.
  • Refactoring-result code.
  • Patches.
It's really frustrating during development to see code like this:
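For instance, something in this spirit (an invented stand-in, not the original snippet):

    // Invented example: a commented-out call left behind next to its replacement,
    // with magic arguments nobody can explain anymore.
    public class OrderProcessing {

        public void handle(Order order, Customer customer) {
            // submitOrder(order, customer, 0, false);
            processOrder(order, customer, 0, true);
        }

        private void processOrder(Order order, Customer customer, int mode, boolean notify) {
            // ...
        }

        record Order() { }
        record Customer() { }
    }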


  • What does the '0' mean?
  • Why is the first line still present there as commented code?
  • Is it really important to have it there?
  • Should I remove it?
  • What about the other parameters?
  • Why was it commented out in the first place?
This can lead you to several problems if you don't know how to answer these questions. But then again, why would you? Wouldn't it be better to have something like this?
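Perhaps something like this (again an invented stand-in):

    // Invented "after" version: the dead line is simply deleted (version control still
    // remembers it), and the magic values become named, self-describing arguments.
    public class OrderProcessing {

        public void handle(Order order, Customer customer) {
            processOrder(order, customer, ProcessingMode.STANDARD, Notification.EMAIL);
        }

        private void processOrder(Order order, Customer customer,
                                  ProcessingMode mode, Notification notification) {
            // ...
        }

        enum ProcessingMode { STANDARD, EXPRESS }
        enum Notification { NONE, EMAIL }

        record Order() { }
        record Customer() { }
    }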



If it's dead code (I mean, it's already dead), it should be removed for the sake of the other developers. If you are working with a Revision Control System (and I don't know why you wouldn't be), the commented code is no longer required: you can revert or reapply a patch whenever you want.

In summary, get rid of your dead code, kill it, burn it, bury it. 


December 23, 2011

People worth following

In software development there are a lot of resources and a lot of people worth looking at, but finding out who is worth following may take some time. That's why I'm summarizing the people I enjoy reading or listening to on any topic related to software engineering and development in general.



Robert C. Martin
The 'Master Craftsman'. I've read some of his books and they are pure gold; his experience and knowledge are outstanding, and you can still see him coding!


Joel Spolsky
One of the heads behind StackOverflow, a hardcore developer who has been doing really cool things for more than a decade. His blog posts are extremely interesting and very, very useful.


Jeff Atwood
Another of the heads behind StackOverflow, a very skillful developer who has been delivering lots of tips, principles and advice for years.


Miguel de Icaza
He is the mind behind the initial Gnome project, started Mono and is now driving his own company, Xamarin. He might be controversial at times, but he is definitely worth following.


Scott Hanselman
One of the most productive developers I've seen: he's a .Net evangelist with years of expertise. His projects and blog are just amazing. Besides that, he's a podcaster with many memorable recordings.




September 27, 2011

Code freeze

When you are developing something at a constant pace and everything is moving smoothly, how do you test?

Do you stop coding and then test what you've done? And then go back to coding?



This may work for some people, but I struggle when I have to multi-task (this does not apply to TDD), and some of the people I know feel the same way.

A while ago I worked on a project that had a 'code freeze' day: on this day nobody could commit anything to the repository, and all you had to do was test and register the bugs you found.

This caused some friction for us: our whole development model was to produce something and deliver it as soon as possible, trying to fix bugs as soon as they were registered.

Advantages

  • Bugs were found faster.
  • We could plan and estimate all our bug fixes.
  • All bugs were easily reproducible.
  • All bugs were registered and assigned properly.
  • We switched roles: for one weekday we turned into testers (and loved the idea of relaxing while working).