In a series of short posts published in 2014 on the website of the Fotomuseum Winterthur, artist and photographer Trevor Paglen elaborates his notion of «seeing machines». And he ponders: «What happens if we think about photography in terms of imaging systems instead of images?» The phrase ‹imaging systems› is interesting because it shifts the view – and our understanding of the concept – of what (photographic) images are from the end result (pictures) to the processes that produce this result and render it meaningful. From the image as a product to ‹imaging› as a socio-techno-cultural process. In his posts, Paglen delineates what he calls an «expansive definition of photography». In broad strokes he maps the terrain: «Seeing machines includes familiar photographic devices and categories like viewfinder cameras and photosensitive films and papers, but quickly moves far beyond that. It embraces everything from iPhones to airport security backscatter-imaging devices, from electro-optical reconnaissance satellites in low-earth orbit, to QR code readers at supermarket checkouts, from border checkpoint facial-recognition surveillance cameras to privatized networks of Automated License Plate Recognition systems, and from military wide-area-airborne-surveillance systems, to the roving cameras on board legions of Google’s ‹Street View› cars». Moreover, Paglen’s new definition encompasses not only the apparatus for ‹making› images, but also the resulting images themselves and the way they are interpreted, whether by humans or by other machines and algorithms.
Thus, one might say, this definition of photography explicitly integrates narrative and rhetorical notions, which have always been part of photography’s discourse, but which were usually seen as contingent effects of the cultural use of photographs rather than as constitutive of what photography is. In this expanded – or indeed ‹post-photographic› – concept, a photograph does not merely produce meaning; the image is also produced by the meanings we project onto it and by our understanding and use of the technologies and practices through which it becomes meaningful.
Paglen introduces the notion of ‹scripts› for this entanglement of technology and its cultural or economic or political or administrative use: «I think about a ‹script› as the basic and obvious function of an imaging system, its ‹style› of seeing, and the immediate relationships (between seer and seen, for example) it produces, and the obvious ways in which a seeing machine sculpts the world». He describes, for instance, the way an automated number plate reader (ANPR) «wants to see» the world – by not just photographing car number plates, but by connecting the data extracted from these images to information about the location of the vehicle, the owner and public or private records that will make this data meaningful in specific ways. Thus, states Paglen, «seeing machines create cultural, economic, and political footprints on society at large». The same goes for the common digital camera, I’d say, with its gridded viewfinder rectangle, which is part of a ‹script› that also includes Insta memes and YouTube tutorials that instruct camera users on how the medium, and not just the technical tool, «wants to see the world».
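The ANPR ‹script› Paglen describes can be caricatured in a few lines of code. The sketch below is purely illustrative – the plate number, registry fields and flag values are invented, and no real ANPR system is this simple – but it shows the essential move: the photograph of the plate is only the entry point, and the ‹meaning› of the event is assembled by joining the extracted plate string to location data and to whatever records the system’s operators consider relevant.

```python
# Toy sketch of an ANPR 'script': the image yields a plate string,
# and the script makes it 'meaningful' by joining it to other records.
# All records, field names and values here are invented for illustration.

registry = {
    "AB-123-CD": {"owner": "J. Doe", "flags": ["toll unpaid"]},
}

def anpr_event(plate, camera_location):
    """Turn a recognized plate into an 'event' by attaching location
    and registry information -- the step where the seeing machine
    'sculpts' the world into seer/seen relationships."""
    record = registry.get(plate, {})
    return {
        "plate": plate,
        "seen_at": camera_location,
        "owner": record.get("owner"),        # None if unregistered
        "flags": record.get("flags", []),
    }

print(anpr_event("AB-123-CD", "checkpoint 7"))
```

The point of the sketch is that the camera’s output is the least interesting part of the system: the script lives in the joins.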
Grammars of action
Paglen’s notion of ‹scripts› closely resembles what computing and artificial intelligence scholar Philip Agre described as «grammars of action» within computer «capturing systems» (Philip E. Agre, «Surveillance and Capture: Two Models of Privacy», The Information Society, 10:2, 1994, 101-127). Agre developed these concepts in the context of theoretical reflections on how computing systems deal with ‹information›, mostly within the framework of surveillance for administrative and business uses. But I think it is worthwhile to project these theoretical constructs onto how we see photographs, and how we make and use them within narrative scripts that direct our interpretation of the pictures which, at the same time, serve as visible substantiation of the narrative.
In Agre’s terms, information is commonly seen as true – «that it corresponds in some transparent way to certain people, places, and things in the world». It is, as the etymology of the term ‹data› suggests, a ‹given›. In order to be processed by computers, information needs to have a certain structure, with rules that govern how each bit of information – each ‹given› – becomes meaningful in relation to others, so that together they align into something we can make sense of and accept as ‹true›. These structures are called grammars, in analogy to how grammar governs the alignment and variation of words in a correct sentence. Agre develops the concept of «grammars of action» to describe how computers (i.e. engineers and coders) structure data representing actions in the ‹real world›. In order to make sense of data ‹captured› from connected sources (cameras, sensors, other computers) or inputs (records of time, movement, location, etc.), the computer program needs to be able to recognize specific objects, variables and relations between data. So the program projects a ‹grammar› onto what it ‹sees›. For our purposes, one could say that in photography, for instance, resolution, contrast and focus, among other things, determine to a large extent what the camera sees. Things smaller than the grains of the film or the pixels of the digital sensor, or contrasts below the threshold of density differences that these grains or pixels can render, are not seen, just as things outside the picture frame are not seen. Thus the camera ‹wants to see› the world in a framed, specifically conditioned way that the photographer needs to comply with – a grammar of action that fundamentally conditions the way we make and see photographs.
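This grammar of seeing can be made concrete with a toy model. The sketch below is a deliberate simplification, not a description of any real sensor: it treats a ‹scene› as a one-dimensional list of brightness values and shows how two parameters of the camera’s grammar – pixel size and contrast threshold – decide in advance what can and cannot be seen.

```python
# Toy model of a camera's 'grammar of action': the device registers
# only what fits its grid and exceeds its contrast threshold.
# pixel_size and contrast_threshold are illustrative assumptions.

def capture(scene, pixel_size=2, contrast_threshold=0.1):
    """Reduce a 1-D 'scene' (brightness values 0..1) to what the
    camera can 'see'."""
    # Spatial resolution: detail smaller than a pixel is averaged away.
    pixels = [
        sum(scene[i:i + pixel_size]) / len(scene[i:i + pixel_size])
        for i in range(0, len(scene), pixel_size)
    ]
    # Contrast threshold: differences too small to render are flattened.
    return [round(round(p / contrast_threshold) * contrast_threshold, 2)
            for p in pixels]

# A scene with subtle variation: the first four values differ slightly,
# but the capture collapses them into two identical 'pixels'.
scene = [0.50, 0.52, 0.51, 0.49, 0.90, 0.92, 0.10, 0.12]
print(capture(scene))  # → [0.5, 0.5, 0.9, 0.1]
```

The subtle differences in the scene survive neither the grid nor the threshold: the camera has already decided what counts as visible before anyone looks at the picture.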
Capture, therefore, in this case means a lot more than just mechanically registering light on film: it anticipates all the actions a photographer must perform in order to make a viable photo, and projects the criteria by which this photo will be judged as a ‹good› or ‹bad› photo back onto the photographer.
Agre stresses the impact of such grammars by pointing to «a kind of mythology» that is often constructed around them, «according to which the newly constructed grammar of action has not been ‹invented› but ‹discovered›. The activity in question, in other words, is said to have already been organized according to the grammar». Thus we say that photography, for instance, was merely ‹discovered› as a technology for rendering the world as we already see it. Agre, on the other hand, would insist that it rather constitutes «a reorganization of the existing activity, as opposed to simply a representation of it». This reorganization is what Paglen calls ‹script›. The scripts embedded within the very core of seeing machines reorganize the way we see the world, and impose this reorganized way of seeing on us, their users. In Agre’s context, the grammars of action that structure the way that computers make sense of data are projected back onto the computer’s users. If the user’s input does not match the program’s expectations, the computer says ‹no›. Seeing machines do something similar in structuring the way they ‹see› the world in a specific manner, which in turn reorganizes the way we experience it. A very funny example of this reorganization is Erik Kessels’ 2010 collection of attempts by amateurs to photograph their black dogs (Erik Kessels, In Almost Every Picture #9, Amsterdam 2010). The camera said ‹no› and produced vaguely dog-shaped black holes in pictures that, through this reorganization of the visible world, become quite uncanny.
Back to the cave painting. The digital photograph of the cave wall becomes a meaningful image for most of us only after we apply a specific grammar to its data that prioritizes certain aspects of the data and discards or reduces others. This reorganizes the way we see the image to the extent that we can now see the outlines of a hand. The story that accompanies the image convinces us that these outlines constitute a hand stencil that was left on the wall of the cave at least sixty-six thousand years ago. Could this image – both the hand stencil itself and the enhanced photograph of it – have been produced by any other means? Yes. Theoretically, the whole configuration of mineral residues could have been of a purely chemical nature, without the interference of any conscious acts by hominids. The chemical analysis and archeological argumentation of the scientists make this theory very unlikely, though. More interestingly, the photo of the hand stencil could theoretically have been produced by other technologies. All kinds of sensors and scanning devices that ‹look for› specific wavelengths of reflected light or emitted radiation, for instance, could produce a similar or perhaps even better image than the enhanced photograph. We have, in short, developed quite an impressive array of seeing machines beyond the traditional camera, machines that we can use to translate any available data into images that pass for representations or ‹likenesses› of reality. Take the cave itself: we can combine ‹LiDAR› or ‹Terrestrial Laser Scanning›, digital cameras, GPS data and animation software to map an interactive 3D model of the cave’s interior relative to its underground location, and get the experience of walking through it – similar to photos or films of it, but also quite different.
An elaborate example of this ‹imaging› of subterranean space was produced by a team from the National School of Surveying of the University of Otago, New Zealand, which scanned and modelled the tunnels and quarries built below the French town of Arras by New Zealand military engineers during the First World War (LiDARRAS project, 2017; a collaboration between the National School of Surveying, Otago, New Zealand; the École Supérieure des Géomètres et Topographes (ESGT, Le Mans, France); the city of Arras; the Museum Carrière Wellington; and alumni from Otago’s School of Mines).
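The enhancement ‹grammar› applied to the cave-wall photograph – prioritizing one band of the data and discarding the rest – can be sketched in a few lines. This is a deliberately simplified stand-in: real rock-art enhancement works on colour channels of actual image data, whereas here a single row of invented brightness values stands in for the wall, and the band limits are arbitrary assumptions.

```python
# Illustrative sketch of an enhancement 'grammar': stretch one narrow
# band of brightness values over the full range, discard everything
# outside it. The band limits and 'wall' values are invented.

def enhance(values, band=(0.40, 0.60)):
    """Map brightness values inside `band` onto the full 0..1 range;
    clip everything outside it. Faint traces within the band become
    strong contrasts; the rest of the data is thrown away."""
    lo, hi = band
    out = []
    for v in values:
        v = min(max(v, lo), hi)                      # discard out-of-band data
        out.append(round((v - lo) / (hi - lo), 2))   # stretch what remains
    return out

# Subtle pigment traces around 0.5, plus a bright glare at 0.95:
wall = [0.48, 0.52, 0.50, 0.55, 0.95]
print(enhance(wall))  # → [0.4, 0.6, 0.5, 0.75, 1.0]
```

The barely distinguishable values near 0.5 become clearly separated, while the glare is flattened to the maximum: the ‹hand› appears because the grammar decided in advance which differences matter.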