[Picture: PDP-8 front panel at initial power on]
PDP-8 (Straight 8) Computer Functional Restoration


Now that all obvious issues had been fixed, the power supply checked separately, and the machine reassembled, I powered it up to see what was working. The picture above shows the first power on of the entire computer. One obvious issue was the run light being on even though the machine should have been halted. The rest of this page documents what I did to get it back to fully functional.

The current status of the computer is that it runs without known issues. Margin testing has not yet been performed to ensure stable operation. So far 15 cards in the processor section needed repair. On the cards three transistors and 26 diodes needed replacing. One more transistor needed replacing due to an error during troubleshooting. The backplanes contain 230 cards, approximately 10,148 diodes, 1409 transistors, 5615 resistors, and 1674 capacitors. Cards only used for interconnect were not counted. That's only about 6.5% of the cards with faults, 0.26% of the diodes bad, and 0.21% of the transistors bad. My parts count spreadsheet is available in OpenOffice format and in Excel format.
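The failure percentages follow directly from the counts in the paragraph above. As a quick check, here is a minimal Python sketch; the variable names are mine, and the numbers are copied from the text. Note that 15 of 230 cards works out to about 6.5%.

```python
# Counts copied from the paragraph above.
cards_total = 230
diodes_total = 10_148
transistors_total = 1409

cards_repaired = 15
diodes_replaced = 26
transistors_replaced = 3  # excludes the one damaged during troubleshooting


def percent(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


card_pct = percent(cards_repaired, cards_total)
diode_pct = percent(diodes_replaced, diodes_total)
transistor_pct = percent(transistors_replaced, transistors_total)

print(f"cards {card_pct}%  diodes {diode_pct}%  transistors {transistor_pct}%")
```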

Troubleshooting method

Since others are unlikely to have exactly the same failures I'll first start with ramblings on the general procedure I used to restore the machine to operation then will ramble on with excessive details about particular failures.

The approach I used for fixing the machine was to start with the most basic issue, fix it, then move on. The steps have some overlap but the general order I followed was:

  1. Fix run light stuck on
  2. Get load address to work
  3. Get reading and writing to small region of memory locations to work
  4. Get all front panel lights and operations to work
  5. Get simple programs toggled in from front panel to work
  6. Get Teletype interface to work
  7. Get diagnostics to pass in suggested order
  8. Get machine to operate over some range of margin voltage
There were deviations such as playing Focal lunar lander after the machine looked stable enough to try.

R Series Logic information

The R series logic used in this machine is diode transistor logic. The actual logic operations are performed by diodes with transistors amplifying the signal and shifting it back to proper logic levels. The logic is pulled up to ground by a transistor when on and pulled down to about -3V when off by a resistor. The logic notation used indicates for the particular signal if 0 or -3V is considered true. Some of the boards have expander inputs which are not at normal levels. When funny levels are seen check the input type. The R series logic used a notation in the schematics that isn't used anymore. It's explained in this document and the earlier logic handbooks available here.

Tools

For examining signals I used a tube Tektronix 547 analog scope, a Rigol 1052E digital scope, and a Philips PM3585 logic analyzer. I also used several Digital Volt Meters (DVM) both handheld and bench top. I used a Pace desoldering station for removing components and Weller and Ungar temperature controlled soldering irons.

Logic troubleshooting

After I had identified something that wasn't working properly I looked at the schematics to find the logic that controls that operation. The machine and boards were revised during their production run and the documentation I have didn't come with this particular machine, so I had to watch for whether the schematic versions I was using matched my machine. This apparently was an issue when people were using these machines for real work since one of my maintenance manuals has the Teletype interface schematic labeled "not our 8". From the schematic I start probing at a likely spot until I find a signal that is wrong. After identifying a bad signal I trace the logic back until I find a point where all the input signals look good and the output signal looks bad. A bad signal can fail to switch when it should, switch when it shouldn't, or have the wrong voltage levels. When I have a spare I swap the card(s) driving the signal to see if it fixes the problem. That only sometimes works.

There are multiple ways a signal can be bad. Since the R series logic is only actively driven to ground, multiple outputs can be wired together to perform a logical OR function. This is even done on flip flops where they can be set by pulling the output to ground. Since the schematics don't have any cross reference information, it can be a challenge to know if the signal has multiple drivers. Another common failure is that the input of a card is bad so it pulls the signal towards ground. Finding all cards that receive a signal is just as hard as finding all drivers. If I had spares I would swap the driven cards next. If I didn't have spares I would put a piece of tape over the driver's output contact to isolate it. Make sure you never disconnect the collector clamp to -3V and leave a pull-down to -15V or you will kill the transistor. If the signal is now good then you know it's another card. That procedure can be repeated until you find the card whose isolation fixes the signal.
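The wired-OR behavior and the isolate-one-card-at-a-time procedure can be modeled in a few lines. This is only an illustrative Python sketch: the slot names and states are hypothetical, and the model treats a bad input that drags the node toward ground the same as a driver stuck on.

```python
GROUND = 0.0     # volts, level when any transistor pulls the node up to ground
NEGATIVE = -3.0  # volts, level when only the resistor pull-down is acting


def node_level(pulls_to_ground):
    """Wired-OR node: ground if any connected card pulls it to ground."""
    return GROUND if any(pulls_to_ground.values()) else NEGATIVE


def find_faulty_cards(cards, expected_level):
    """Isolate one card at a time (the tape-over-the-contact step) and
    return every card whose removal restores the expected level."""
    suspects = []
    for name in cards:
        remaining = {k: v for k, v in cards.items() if k != name}
        if node_level(remaining) == expected_level:
            suspects.append(name)
    return suspects


# Hypothetical example: three cards share a node that should sit at -3 V,
# but one of them is holding it at ground.
cards = {"slot_A": False, "slot_B": True, "slot_C": False}
print(node_level(cards))                   # 0.0
print(find_faulty_cards(cards, NEGATIVE))  # ['slot_B']
```

If two cards are pulling the node at once, isolating either one alone does not restore the level and the function returns an empty list. In that case more than one contact has to be taped at a time.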

Logic troubleshooting example

To troubleshoot the run light stuck on I found the flip flop that holds the run state in the machine schematics which was in the R202 at PB34. The straight 8 backplanes separate making it very easy to probe the signals so I started checking signals (relevant schematic pages page 10-41 and page 10-97). I probed the run flip flop output and verified it was stuck on. That let me know the problem wasn't something like the bulb driver was stuck on. One of the ways the run flip flop is cleared is by the R111 at PA27. No T1 pulses that should have been present when running were received at the K input. I traced the missing pulses to the S603 in PB36 where the inputs were fine but the F output wasn't switching.

After replacing that S603 card the T1 pulses were present but the run light was still stuck on and now the machine was stepping through all memory addresses. I probed more and found that pulses were present at the R set input to run flip flop. This signal is driven by several locations. Going through them and using tape to isolate each driver I finally found that the S603 at PF28 was generating pulses on its output with no change of signals on the relevant inputs. Replacing that card finally got the run light to stay off.

Due to other faults in the memory logic it cleared all the core when it stepped through all memory addresses. I was hoping to see what was the last program it was running. It probably would have been better to have left the core uninstalled until the machine was tested as much as possible without memory.

The MAINDECs were designed both to detect failures and to help in figuring out what the failure is. Reading the operation and relevant section of code will frequently give you enough information to tell what is going wrong. For example when running MAINDEC-08-D05B the machine sometimes halted. The halt was at address 1001 octal. The relevant portion of the diagnostic's operation is that it puts an I/O Transfer (IOT) and jump subroutine (JMS) instruction at a random location in memory then executes a jump (JMP) to it. Memory location 0130 stored the JMS location which was 1200 so the IOT was at 1177. The likely cause is that after 1177, instead of incrementing the program counter (PC) to 1200, the carry from bit 5 to bit 4 didn't work and the PC went to 1000 instead. It then executed a halt (HLT) so the machine halted with PC 1001. Fixing the PC bit 5 carry problem on the R211 at PC11 fixed the problem. The logic analyzer captures show this was the cause.
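The arithmetic behind that diagnosis can be checked with a small model of a ripple-carry increment. This is a Python sketch, not a description of the actual R series circuit; the function name and the `broken_carry_into` parameter are made up for illustration. Bits use DEC numbering, where bit 0 is the most significant and bit 11 the least.

```python
def increment_pc(pc, broken_carry_into=None):
    """Ripple-carry increment of a 12-bit PC.

    If broken_carry_into is given, the carry arriving at that bit
    (DEC numbering, 0 = MSB) is lost.
    """
    result = 0
    carry = 1  # incrementing is adding 1 at bit 11
    for dec_bit in range(11, -1, -1):
        if dec_bit == broken_carry_into:
            carry = 0  # the faulty stage never sees the carry
        weight = 1 << (11 - dec_bit)
        bit = 1 if pc & weight else 0
        total = bit + carry
        if total & 1:
            result |= weight
        carry = total >> 1
    return result


good = increment_pc(0o1177)
bad = increment_pc(0o1177, broken_carry_into=4)
print(f"{good:04o} {bad:04o}")  # 1200 1000
```

With the carry from bit 5 into bit 4 lost, 1177 wraps to 1000 instead of advancing to 1200, which matches the halt seen with the PC at 1001.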

Component level troubleshooting

After I found the bad card I then tried to identify the failed component. First I used my transistor tester to test the transistors. Then I used my DVM on diode test to test the diodes then on resistance to check the resistors. The other components connected to the part under test can cause the reading to be wrong so I look at the schematic to see if the particular component has anything obvious that should interfere with the measurement. For many of the cards the logic is replicated multiple times so the copies should measure similar. Only schematics are available for the cards but not the assembly prints which makes it difficult to troubleshoot the card. I took a photo of the card and lightened it using Gimp and printed it. I could then write the component reference designations on it for the portion of the card I was trying to fix. I would then put the card on an extender and probe it with a scope to determine the cause of the incorrect behavior seen.

The type of problems found are:

After finding "the bad component" I would remove it and see if it still seemed bad. If it did then I would put a new component in and see if that actually fixed the board. If not repeat. Luckily I found years ago at a hamfest a set of DEC spare parts boxes so have a reasonable number of diodes and transistors.

After fixing the card and verifying that particular problem is gone it's time to find the next problem and repeat. The annoying failures are the intermittent ones. For them I would try to find a method of getting it to repeat often enough to troubleshoot. Then I would figure out how to trigger the test equipment on the failure to trace it back to a wrong signal.

Component troubleshooting example

Now for a specific example of a card that was more difficult to troubleshoot. When trying to send a character out the Teletype interface nothing happened. The IOT signal was setting the TTO enable flip flop but it cleared after three microseconds though no input signals changed at that time. The TTO enable is generated by a R220 at location ME22. Visual inspection of the R220 showed the card had failed sometime in the past and had components replaced but no other obvious issues. Measuring the components showed one diode with forward drop of .4V vs the normal .6V. I also found two resistors which read about half the proper value. The resistor values reading wrong were due to a solder bridge that shorted two traces together. I fixed it though it looked like it was from when the machine was first made so may not have been causing problems. I also changed the diode. Neither of these actually fixed the card. After finding where the components were for the bad output I started looking at signals. I found one of the transistors' base drives was off for a while then slowly crept down until it turned the transistor on switching the output. I found a similar decreasing voltage on the other side of a diode. I then got a couple k resistor out and held it on to pull the other side of the diode up (after verifying that this won't cause damage). That made the output stay at the proper level verifying the diode was bad. I then replaced the diode and the output signal was correct.

Memory troubleshooting

Now that the machine wasn't doing obviously wrong things I tried to see what was working with the memory. When I wrote to memory the contents were always zero and the front panel didn't look like writes were working properly. To write to memory you set the program counter register with the front panel then set the switch registers to the value to write and hit the deposit key. That causes the machine to load the switches into the accumulator and execute a deposit and clear accumulator (DCA) instruction. I could see the switches were loaded into the accumulator but the instruction decode showed AND (000). This failure was caused by the S203 in PB28 pulling the clear input to the instruction register active. Replacing the card got the deposit switch to show the proper DCA instruction.
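The front panel instruction lights decode the top three bits of the instruction word, so an instruction register that is held clear always shows AND. A minimal Python sketch of that decode follows; the `ir_stuck_clear_mask` parameter is a hypothetical way to model IR bits being forced to zero.

```python
# PDP-8 major opcodes, indexed by the top three bits of the word.
OPCODES = ["AND", "TAD", "ISZ", "DCA", "JMS", "JMP", "IOT", "OPR"]


def decode(word, ir_stuck_clear_mask=0):
    """Return the mnemonic for the top three bits of a 12-bit word.

    ir_stuck_clear_mask is a 3-bit value marking IR bits forced to zero
    (0o4 = IR0, 0o2 = IR1, 0o1 = IR2).
    """
    ir = (word >> 9) & 0o7
    ir &= ~ir_stuck_clear_mask & 0o7
    return OPCODES[ir]


print(decode(0o3000))        # DCA, the instruction deposit should execute
print(decode(0o3000, 0o7))   # AND, what the lights show with the IR held clear
```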

After I got all the obvious logic issues straightened out the memory showed some signs of life but many bit errors. I checked the read, write, and inhibit currents and they were OK. I then performed the memory alignment procedure. First I adjusted the bottom pot on the G008 to set the differential amplifier operating point to 8V. It was about 7.3V. I then adjusted the G007's balance pot as close to zero as possible. The offset moves in jumps with adjustment of the pot. The offset varied from -22 to 77 mV. On one of the cards the offset would jump from -170 to -46 to 77 mV as the pot was adjusted. I think this is due to moving from one winding on the pot to the next as it was adjusted and not due to a defect.

After this I set the slice threshold at 7.5V and tried the memory. It now seemed to work. Later after I got the memory test working I set the slice voltage to about the middle of the range between failures. That was about 7.67 volts. The memory strobe timing looked fine so I didn't adjust that. I have not yet attempted the margin checking procedure in the memory alignment instructions, which is used to fine tune the slice level. At this point I had a chicken and egg problem: do I run the memory test to verify all the memory is working before I try the processor tests, or do I run the processor tests to verify the processor is working so I know the memory tests should run? I went with running the processor tests.

Toggle in tests

First I tried some simple toggle in tests using the front panel. I used some of these. The BSW instruction doesn't work on a straight 8 so I skipped that instruction in the tests. My notes aren't entirely clear but I think the instructions worked but the Teletype interface didn't. Fixing the R220 in the Teletype was previously discussed. Initially I got the Teletype interface working OK with a current loop to RS-232 adapter to my PC but not with my Teletype. Later I found one of the stop bits from the PDP-8 at the wrong level. After I fixed it and a problem in the Teletype it now works fine. After simple instruction tests and the Teletype echo test worked it was time to try the DEC MAINDEC diagnostics.

MAINDEC

Now that the processor and memory were working well enough I started running diagnostics. I dug around in my stuff and online and was able to come up with the following diagnostics. The first one I ran (maindec-08-d01a-b-pb) halted after a short while. After trying to find the full writeup and not finding it I used d8tape to disassemble the MAINDEC so I could figure out what was failing. After looking in memory the cause became obvious since right before the failing instructions the memory content was zero instead of the proper value. This was due to a bad transistor Q15 on the G209 in MD8 that drove one of the read/write lines during a write.

At this point loading tapes at 110 baud was getting old. I fixed a spare R401 and put it in place of the R405 in ME15. After adjusting it to get 9600 baud I then could load the tapes much faster. With more testing I am now at the point where the MAINDECS work except maindec-08-d1ac intermittently fails with memory locations getting cleared on power off.
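To get a feel for the speedup, here is a rough Python estimate of load time at the two baud rates. The framing of 11 bits per character (1 start, 8 data, 2 stop) is an assumption applied to both rates, and the tape length is a hypothetical round number, not a measurement of any particular MAINDEC.

```python
BITS_PER_CHAR = 11  # assumed framing: 1 start + 8 data + 2 stop bits


def chars_per_second(baud, bits_per_char=BITS_PER_CHAR):
    """Characters (tape frames) transferred per second at a given baud rate."""
    return baud / bits_per_char


def load_seconds(frames, baud):
    """Time to transfer the given number of tape frames."""
    return frames / chars_per_second(baud)


TAPE_FRAMES = 4000  # hypothetical tape length in frames

for baud in (110, 9600):
    print(f"{baud} baud: {load_seconds(TAPE_FRAMES, baud):.1f} s")
```

Under these assumptions the same tape drops from several minutes to a few seconds.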

Failed Card Information

| Location | Card | Fault | Component replaced |
|---|---|---|---|
| PC17 | R211 | MA bit 10 failed | Q4 |
| PA13 | R210 | AC bit 6 failed | Q12 |
| ME22 | R220 | No Teletype output, TTO flip flop wouldn't stay set | D18 and D43 |
| MD8 | G209 | Some memory locations always zero. Core X line not driven | Q15 |
| MF14 | S202 | Teletype baud rate wrong, clock divider stage A to F didn't divide | D8, D23, D44, and D52 |
| PB28 | S203 | Deposit to memory didn't work. Instruction register cleared | D2, D10, D13, and D16 |
| PB36 | S603 | F output bad, kept run flip flop set | D43 |
| PF28 | S603 | T output bad, kept run flip flop set | D4 |
| PA19 | S603 | Rotate accumulator right (RAR) instruction didn't work, RAR signal not generated | D17 |
| ME20 | S202 | Prevented Teletype interface from working. Pulls signal on L input (TTO enable) towards ground | D18, D35, D45, D47 |
| MF15 | S202 | No Teletype clock, T output bad | D8 and D36 |
| PC11 | R211 | No carry from PC 5 to PC 4 when incrementing PC. | D56. A slip with the scope probe required Q3 to be replaced. |
| PE31 | S603 | EAE logical right shift not working, right and left shift asserted at same time. T output toggling at strange levels | D2, D22, D43, and D48 |
| PF23 | S602 | EAE logical right shift not working, right and left shift asserted at same time. | D20 and D31. D9, D15, and D30 removed since values seemed wrong but passed test off card and were reinstalled. |
| MD26 | R650 | BMB6(1) bad | Broken transistor lead. No schematic available to get Q number |
| ME15 | R401 | No clock output. This board was a spare that I used to replace the R405 for 9600 baud on the Teletype interface. | Bad solder joint. |
| PD34 | R002 | SP1 signal loaded down by bad diode. | D7 |
| MF20 | S202 | January 2013 (dated repairs are after machine was declared functional in May 2012). IN Active flip flop wasn't clearing after receipt of a character. This prevented receiving more characters. One was the typical leaky diode where the forward and reverse voltage read low on the DVM. The other's forward voltage read high (1.2V). | D23, D24 |
| PA23 | R002 | March 2015. Found a bad low forward voltage diode when borrowing card to repair MARCH's straight 8. Unknown if it was causing problems. | D10 |
| PC36 | S203 | January 2018. MAINDEC D05B was failing with zero being written to the JMS return address. Thought it may be due to interrupt handling and checked diodes on this card. Found 6 bad with low forward voltage drop. Replacing them fixed the problem. | Not determined |
| PB28 | S111 | March 2018. Couldn't run RIM loader. IOT instructions were being decoded as JMS. IR1 was not getting set from MB1. Q3 read bad with transistor tester. After removal read ok. Likely the heating to remove temporarily fixed it. New transistor installed. | Q3 |
| PC28 | S203 | August 2022. Machine was executing wrong intermittently but had difficulty finding code to give consistent failure. Finally found BIN loader would get stuck on one instruction loading a tape so could troubleshoot. Issue was IR0 was getting cleared at end of fetch cycle so when it started defer cycle it executed the instruction wrong. In this case an indirect jump didn't jump. Replaced two bad diodes and bad transistor. One diode was cracked so had high forward voltage and split in half when unsoldered. The other had low forward voltage. The transistor read dead with my transistor tester. | Not determined |
| MF15 | S202 | February 2023. Teletype interface intermittently seemed to have errors. Finally stopped transmitting to allow easy fault determination. Traced to no Teletype clock, T output not toggling. One diode read 1.4V forward drop. Replaced. Second repair for this card. | Not determined |
| ME21 | S203 | April 2023. Teletype reader run signal stopped getting set. Found 4 bad diodes. | Not determined |

How I physically repaired the cards is in the pictures at the bottom.

In repairing the cards that caused improper operation of the computer I decided to replace all components which measure out of specification. I did not try to determine if the component was actually causing improper operation unless replacing the out of spec components didn't fix the problem. I checked some cards which aren't exhibiting problems and they also seem to have diodes somewhat out of spec which I will not replace unless margin testing shows the card has problems.

All the transistors that needed to be replaced measured dead with my transistor tester. One diode measured open though would read OK when pressure was applied. Two diodes measured normal with the diode check on the DVM but had sufficient leakage at higher reverse voltage to cause problems. The 23 other bad diodes measured different from a good diode in either the forward or reverse direction. They did not necessarily read bad in both directions. The good diodes on most of the cards measured > .58 V for forward drop and > 1.7 volts reverse biased in circuit. After I got on a roll fixing cards, on the last one to fix (an S602) a couple of diodes measured around .56 V forward drop and 1.4 V reverse biased. I removed them but they were really OK. The reading was from loading from other components on the card. I should have checked against another board before removing them.
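The lesson about checking against another board can be expressed as comparing each in-circuit reading with the same position on a known good card, rather than with one fixed threshold. The Python sketch below is only an illustration: the designators, readings, and the 0.05 V tolerance are all hypothetical values, not measurements from this machine.

```python
def suspect_diodes(readings, reference, tolerance=0.05):
    """Return designators whose forward or reverse reading differs from
    the reference card by more than tolerance volts.

    Both dicts map designator -> (forward_volts, reverse_volts),
    measured in circuit with the DVM diode test.
    """
    suspects = []
    for name, (fwd, rev) in readings.items():
        ref_fwd, ref_rev = reference[name]
        if abs(fwd - ref_fwd) > tolerance or abs(rev - ref_rev) > tolerance:
            suspects.append(name)
    return suspects


# Hypothetical readings. D2 reads low on both cards because of loading
# from other components, so it is not flagged.
reference = {"D1": (0.60, 1.75), "D2": (0.56, 1.40), "D3": (0.60, 1.75)}
readings = {"D1": (0.40, 1.75), "D2": (0.56, 1.41), "D3": (1.20, 1.75)}

print(suspect_diodes(readings, reference))  # ['D1', 'D3']
```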

All the bad diodes were DEC D-664. The bad transistors were one DEC-1008 and two DEC-3639B.

December 2012
The machine was acting somewhat strange where it stopped reading from the Teletype paper tape reader occasionally and it corrupted the contents of the DF32 disk when I tried to load a program onto it. I found that bit 10 was always being written as a one to the disk. This problem was tracked down to the cable between the processor side and the memory side having a break in the MB10 signal. That line had a high resistance that varied with cable movement. The break appears to be at a little dimple in the cable on that conductor. Apparently at some point the cable got poked by something though it isn't clear what could have damaged it. Since I didn't see any clean way to repair the break in the cable I soldered a wirewrap wire between the PCBs at each end to connect the signal. The machine seems to be working again now. The Teletype fault was due to a board failure. It acted up again a few weeks later. That repair is in the board repair list above.

February 2023
The front panel has been somewhat unreliable, with deposit the worst and examine also acting up. The logic uses the unlatched state of the switches so if they bounce too long the signal state changes during the cycle causing improper operation. You can see the PC end up with values other than increment by 1 and the wrong instruction light lit on the front panel. The bounce is too long for just increasing the debounce time to be useful. Cleaning the switches helps at least for a while. I've done it previously but forgot to note it. This is the latest cleaning. I may have to give up and replace the switches at some point.

March 2024
I bumped some of the margin switches and some went high resistance. Some I was able to use contact cleaners on and get acting ok. A few I bypassed with a wire to try to make sure I didn't have problems when I brought it to Rockville Science Day. Will need to figure out better solution.

June 2024
The front panel switches are still a problem. Replacing them is difficult due to the switch handle supports that are part of the switches. Since my opening and cleaning of the margin switches didn't last I won't try that until I figure out how to make it last. Instead I decided to add logic to debounce the switches better. I added an inline board to debounce the switches. I left the first one I made in the VCF straight 8 so I will need to make another for myself. It's too early to tell if it fully cures the problem but I was able to toggle in a 21 instruction program without problems.
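For readers unfamiliar with debouncing, the idea can be shown with a small software model. This Python sketch is not the circuit on the inline board, whose design isn't described here. It only illustrates one common approach, where the output changes after the raw switch signal has held a new value for a set number of consecutive samples. The sample data and the count of 3 are made up.

```python
def debounce(samples, stable_count):
    """Return the debounced output for each raw 0/1 sample.

    The output changes only after the raw input has differed from it
    for stable_count consecutive samples.
    """
    output = samples[0]
    run = 0  # consecutive samples that differ from the current output
    result = []
    for s in samples:
        if s == output:
            run = 0
        else:
            run += 1
            if run >= stable_count:
                output = s
                run = 0
        result.append(output)
    return result


def count_transitions(samples):
    """Count how many times the signal changes value."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a != b)


# Hypothetical bouncing switch closure.
raw = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1]
clean = debounce(raw, stable_count=3)
print(count_transitions(raw), count_transitions(clean))  # 7 1
```

Logic that looks at the raw signal sees seven edges during one key press, while the debounced signal has a single clean edge.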

The following picture links also have descriptions of what is shown in the pictures.

Initial Power On ( 91K)
Testing Card ( 63K)  Testing Signals ( 68K)
R220 Shift Register ( 97K)  R220 Front ( 93K)  Waveforms ( 18K)  Bad Diode Curves ( 11K)  Board Layout ( 51K)
Logic Analyzer PC Error ( 30K)  Logic Analyzer PC Fixed ( 29K)  Logic Analyzer Hooked up ( 64K)
Soldering equipment ( 78K)  Desoldered diodes ( 76K)  Spare Parts Kit ( 68K)
Wrong diodes? ( 72K)  Extra hole ( 64K)

Feel free to contact me, David Gesswein djg@pdp8online.com with any questions, comments on the web site, or if you have related equipment, documentation, software etc. you are willing to part with.  I am interested in anything PDP-8 related, computers, peripherals used with them, DEC or third party, or documentation. 