Extracting Linework from Vector PDF Files

In the following video we look at a fairly typical PDF file that has been georeferenced into a TBC Project and how to extract linework from the PDF.

We look at three different approaches:

  1. Vectorize the PDF to pull out all of the lines, then sort out the lines that are needed, e.g. for Finished Grade and Existing Contours.
  2. Vectors on Demand - this allows us to extract the contours one at a time. In this example the Design Contours are problematic because they are drawn as arcs and lines that are not connected, so we show some techniques that you can use to extract them and join the arcs together using Smart Join.
  3. Lastly, we show how you can use the Takeoff Lines command to digitize the linework using the vector snaps, to draw in the last elements of the lines that are required.

Vector lines in a PDF are typically drawn in one of two ways in AutoCAD (which most likely generated the PDF file).

  1. Lines can be drawn as polylines. In this case each line is continuous through all of its elements. However, polylines can be drawn with e.g. dashed linestyles, which means that the PDF file contains a lot of short lines that have to be connected together. When you vectorize the entire PDF page, TBC offers the ability to join dashed lines together - it looks for repeating Dash and Gap patterns that are in sequence in the file and connects the dashes together for you. When you use Vectors on Demand, the tool simply looks at a series of vector point nodes held in a list. It tracks forwards and backwards from the node that you select as the seed point, looking for a jump larger than the max distance setting or a change in direction greater than the max deflection angle setting, to determine where the line starts and ends. When the original line in AutoCAD was a polyline, whether drawn as a solid line or as a dashed pattern line, either mode will work quite successfully.
  2. Lines in AutoCAD can be drawn as CAD Lines and Arcs - these are separate line elements that are not connected. The line and arc segments of the same line may not be in sequence and may not even be drawn in the same direction - Arcs are always drawn in an anticlockwise direction, whereas the line segments can be drawn in either direction. This creates a challenge when you want to join them together automatically. The current Vectors on Demand tool will pull the arcs out easily, and that is what you should do with data like the Design Contours in this file. This also applies to lines that represent e.g. Curb and Gutter lines on most site projects. Once you have the arcs extracted, you can use Smart Join in Simple Join mode to join the arcs together, which creates nice clean linework. Where the first or last segment of a line is a straight line, you can use e.g. Takeoff Lines in combination with the Append Tracked Line command (in the header bar of Takeoff Lines) to manually draw in some elements and then track the arcs into the line that you are creating.
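
To make the seed point tracking described in item 1 more concrete, here is a minimal Python sketch of the idea. It is not TBC's actual implementation: the function name `track_line`, the parameters `max_distance` and `max_deflection_deg`, and the representation of the nodes as a plain list of (x, y) tuples are all assumptions made for illustration.

```python
import math


def track_line(nodes, seed, max_distance, max_deflection_deg):
    """Return (start, end) indices of the run of nodes that contains `seed`.

    Hypothetical sketch: `nodes` is an ordered list of (x, y) tuples.
    """

    def is_break(prev, here, nxt):
        # A jump larger than the max distance ends the line.
        if math.dist(here, nxt) > max_distance:
            return True
        if prev is None:  # no incoming direction yet, so no angle to test
            return False
        heading_in = math.atan2(here[1] - prev[1], here[0] - prev[0])
        heading_out = math.atan2(nxt[1] - here[1], nxt[0] - here[0])
        turn = abs(math.degrees(heading_out - heading_in))
        turn = min(turn, 360.0 - turn)  # fold into 0..180 degrees
        return turn > max_deflection_deg

    # Track forwards from the seed.
    end = seed
    while end + 1 < len(nodes):
        prev = nodes[end - 1] if end > seed else None
        if is_break(prev, nodes[end], nodes[end + 1]):
            break
        end += 1

    # Track backwards from the seed.
    start = seed
    while start > 0:
        prev = nodes[start + 1] if start < end else None
        if is_break(prev, nodes[start], nodes[start - 1]):
            break
        start -= 1

    return start, end


# Three runs in one node list: a straight run, a run after a 90 degree
# turn, and a run after a large jump.
nodes = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (9, 2), (10, 2)]
print(track_line(nodes, seed=1, max_distance=1.5, max_deflection_deg=30))  # prints (0, 3)
```

In this sketch a dashed line only tracks as one line if `max_distance` is larger than the gap between dashes, and a tight curve only tracks if `max_deflection_deg` is larger than the turn between consecutive nodes, which is why the settings need to suit the type of line you are selecting.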

There is also a third type of line drawn in AutoCAD that creates its own issues: extra bold lines. PDF can only support lineweights up to a certain width, after which it draws lines as solid areas, as shown below.

These are not continuous lines that you can easily extract - the best solution is to digitize those types of line using Takeoff Lines.

You can spot the second type of line by looking at the ends of the arc elements. You will find small defects in the PDF where the arc joins the line elements, which tells you that there is a gap between the two elements, as shown here. If you extract all of the lines using Vectorize PDF, you will also see the gaps between the straight and arc elements, as shown below. This tells you that you are dealing with discontinuous segments. This is where Smart Join with the Keep Shortest mode really helps, as it will join the lines and fix the gaps for you in a smooth manner, which will also help you downstream in your Takeoff.
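
As a rough illustration of why discontinuous segments are harder to join, the Python sketch below chains segments that may be out of sequence, drawn in opposite directions, and separated by small gaps. It is only a sketch of the general idea and not how Smart Join works internally: the function name `chain_segments`, the `tolerance` parameter, and the simplification of arcs to short lists of (x, y) points are all assumptions.

```python
import math


def chain_segments(segments, tolerance):
    """Greedily join segments whose ends lie within `tolerance` of each other.

    Hypothetical sketch: each segment is a list of (x, y) tuples. Returns the
    joined chain and any segments that could not be attached.
    """
    remaining = [list(seg) for seg in segments]
    chain = remaining.pop(0)
    progress = True
    while remaining and progress:
        progress = False
        for i, seg in enumerate(remaining):
            # Try the segment as drawn, then reversed.
            for candidate in (seg, seg[::-1]):
                if math.dist(chain[-1], candidate[0]) <= tolerance:
                    # Attach at the end; dropping the near-duplicate
                    # vertex closes the small gap.
                    chain = chain + candidate[1:]
                elif math.dist(chain[0], candidate[-1]) <= tolerance:
                    # Attach at the start of the chain instead.
                    chain = candidate[:-1] + chain
                else:
                    continue
                del remaining[i]
                progress = True
                break
            if progress:
                break
    return chain, remaining


# The middle piece is drawn in the opposite direction and leaves a 0.01 gap.
pieces = [
    [(0, 0), (1, 0)],
    [(3, 0), (2, 0), (1.01, 0)],
    [(3, 0), (4, 0)],
]
joined, leftover = chain_segments(pieces, tolerance=0.05)
print(joined)
```

Here the gap is closed by keeping the chain's existing vertex and discarding the incoming one. That is just one simple choice for the sketch, and a real join tool may resolve the gap differently.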

The key to successful use of Vectors on Demand (VOD) is being able to spot the difference between the two types of line above and knowing how to set up the VOD settings appropriately to generate the right selections.

Video Shows you How
