Secure coding is no longer new in software development. It has been established for a long time: there is a wealth of information about it, there are courses and tutorials, and everything is well documented. All developers should therefore be using it by now. Unfortunately, developing secure software does not start with a simple to-do list, and it cannot be accomplished without the necessary support at all levels of modern applications, which consist of a multitude of components and interactions. Let's briefly outline why this is so and how to get started.

Code stands on its own

Once code leaves the development and test environment, it must function on its own at runtime, without supervision. Computer systems exist, after all, precisely to run automated processes that are largely unattended. Only in the case of an error or exception does software generate messages or results that a human must then process. It is therefore necessary to anticipate the future as well as possible during development and to guard against situations that may occur. This is exactly where potential security vulnerabilities find a good hiding place.

The Internet is often cited to illustrate unknown threats. In fact, any production environment serves this purpose, because even the tidiest network contains sufficiently complex protocols, data and situations that nobody planned for. Software must not fail in any environment, and the program flow must be in a consistent state at all times. A test environment must take this into account. Keep in mind that software sometimes stays in use for years or decades. Over such time spans, not everything can be foreseen. Standards and specifications are the best example. Unicode is an industry standard for the consistent encoding, representation and handling of text in most of the world's writing systems. The current 11.0 standard from 2018 consists of 137,374 characters. Compared to version 10.0 from 2017, 684 new characters have been added. How many will there be in the next standard, what will they mean, and which codes will be assigned to them? Can my application still claim to support Unicode? It is therefore worth thinking carefully about what exactly current software should do with unknown encodings, because this problem will not go away. The same question arises for arbitrary data formats and transmission protocols.
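One way to approach the question of unknown encodings can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name `decode_untrusted` and the choice to collect warnings instead of rejecting the input are assumptions made for illustration. It relies on the standard `unicodedata` module, whose character database is tied to the Unicode version the interpreter ships with, so a code point that is reported as unassigned today may be valid under a later standard.

```python
import unicodedata
from typing import List, Tuple


def decode_untrusted(raw: bytes) -> Tuple[str, List[str]]:
    """Decode bytes as UTF-8 without raising and report anything unexpected.

    Hypothetical helper for illustration only.
    """
    warnings: List[str] = []
    try:
        # Strict decoding: invalid byte sequences raise UnicodeDecodeError.
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        warnings.append(f"invalid UTF-8 at byte {exc.start}")
        # Fall back to a lossy decode; bad bytes become U+FFFD.
        text = raw.decode("utf-8", errors="replace")

    for ch in text:
        # Category "Cn" means "not assigned" in the Unicode version
        # known to this interpreter (see unicodedata.unidata_version).
        if unicodedata.category(ch) == "Cn":
            warnings.append(f"unassigned code point U+{ord(ch):04X}")
    return text, warnings


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    print(decode_untrusted(b"a\xffb"))
```

The point of the design is that the program stays in a defined state whatever arrives: the caller always receives a string plus a list of findings and can decide whether to log, reject or pass the data on.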

Consistent states

An application should be in a consistent state at all times. This theoretical requirement is a good entry point into the world of secure coding and secure design. Code consists of components and usually runs with the help of an operating system. This means that functions which process resources or distribute tasks are called constantly. If all of these operations succeed without errors, the error checks have nothing to do, and only the code's own data structures must be kept consistent. This is the first task of the developers. Creative thinking is required here as well. What happens if the program never terminates? We have seen systems in production environments that have been in continuous use for 7 years. This is often observed with telephone systems and infrastructure in particular. It means more than 220,752,000 seconds of instructions, data accesses and processing. Unit tests cannot map such scenarios, because the time spans are too long and the possible code paths therefore too diverse.
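The figure for seven years of uptime can be checked with a few lines of Python. The same arithmetic also hints at one kind of effect that only shows up after long runtimes: a fixed-width counter that overflows. The counter width and tick rates below are assumptions chosen for illustration and do not describe any particular system. Leap days are ignored, which matches the number given in the text.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Seven years of continuous operation, leap days ignored.
uptime_seconds = 7 * 365 * SECONDS_PER_DAY


def wraps_within(uptime_s: int, ticks_per_second: int, counter_bits: int = 32) -> bool:
    """Return True if an unsigned counter of this width overflows during the uptime.

    Hypothetical helper: counter_bits and ticks_per_second are illustrative.
    """
    return uptime_s * ticks_per_second >= 2 ** counter_bits


if __name__ == "__main__":
    print(uptime_seconds)                      # seconds in 7 * 365 days
    print(wraps_within(uptime_seconds, 1))     # counter that counts seconds
    print(wraps_within(uptime_seconds, 1000))  # counter that counts milliseconds
    # Days until a 32-bit millisecond counter overflows.
    print(2 ** 32 / 1000 / SECONDS_PER_DAY)
```

Under these assumptions a 32-bit counter of seconds survives the seven years, while a 32-bit counter of milliseconds overflows after roughly 50 days. A unit test that runs for a few seconds would never reach that code path.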

Focus on exceptional situations

If the data is “normal”, tests will usually work and find errors. Things get interesting for every application at the edges, where exceptional situations prevail. The purpose of test cases is to find software bugs and to detect whether fixed bugs stay fixed. With secure coding, however, you want to challenge the code with situations that have not yet occurred. This can only be achieved with automated tests that constantly change the input data and check whether the software can cope with it. The method is called fuzzing. It is very old and dates back to the era of punch cards. At that time, defective cards, and cards that had been intentionally taped over or given additional holes, were used as input. In computer science, this approach has experienced a renaissance in recent decades. Today, fuzzing is part of the standard repertoire of software development.

It is very easy to get started, since continuous integration and build tests are automated anyway. Fuzzing extends this process with inputs that are varied randomly by an algorithm. The starting point is normal data, which serves as an example. You can additionally enrich the test corpus with extremes: especially large or deeply nested data, or other inputs that never or rarely occur but push the limits of the data format. The advantages are that testing runs without supervision and that more test data is generated whenever the process finds errors. Integrating fuzzing is the easiest way to get started with secure coding. Find the data that threatens your application before the attackers do.
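The process described above can be sketched as a small mutation fuzzer in Python. This is a minimal sketch under stated assumptions, not a replacement for an established fuzzing tool: `parse_record` is a hypothetical stand-in for the code under test, and the decision to treat `ValueError` as the only acceptable failure is an illustrative policy. The corpus starts from normal data and is enriched with a few extremes. Every input that triggers an unexpected exception is kept as a finding, so it can be added to the test data later.

```python
import json
import random


def parse_record(data: bytes) -> dict:
    """Hypothetical target; stands in for the parser of your own application."""
    obj = json.loads(data.decode("utf-8"))
    return {"name": obj["name"], "age": int(obj["age"])}


# Failures the target may legitimately signal for bad input.
# UnicodeDecodeError and json.JSONDecodeError are subclasses of ValueError.
EXPECTED = (ValueError,)


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one to four random byte-level changes to a seed input."""
    data = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        choice = rng.randrange(3)
        pos = rng.randrange(len(data)) if data else 0
        if choice == 0 and data:
            data[pos] ^= 1 << rng.randrange(8)    # flip a single bit
        elif choice == 1:
            data.insert(pos, rng.randrange(256))  # insert a random byte
        elif data:
            del data[pos]                         # delete a byte
    return bytes(data)


def fuzz(target, corpus, iterations=2000, seed=0):
    """Feed mutated corpus entries to target and collect unexpected failures."""
    rng = random.Random(seed)  # fixed seed keeps a run reproducible
    findings = []
    for _ in range(iterations):
        sample = mutate(rng.choice(corpus), rng)
        try:
            target(sample)
        except EXPECTED:
            pass  # clean rejection of bad input is acceptable
        except Exception as exc:
            findings.append((sample, repr(exc)))
    return findings


if __name__ == "__main__":
    corpus = [
        b'{"name": "Ada", "age": 36}',                      # normal data
        b'{"name": "' + b"x" * 10000 + b'", "age": 1}',     # very large field
        b'{"name": {"a": {"b": {"c": []}}}, "age": "007"}',  # nested data
    ]
    findings = fuzz(parse_record, corpus)
    print(len(findings), "unexpected failures")
    for sample, error in findings[:3]:
        print(error, sample[:60])
```

In a continuous integration setup, a run like this would sit next to the existing build tests, and the collected findings would be stored as new corpus entries and regression tests.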

René Pfeiffer is a freelancer at SEC4YOU in the fields of penetration testing, IT security consulting and secure coding. He regularly proves his competence in challenging security projects. If you have questions, you can reach René Pfeiffer via our contact options.