A studio for ideas
Creative professionals need a place to develop their ideas. Digital authoring tools like Google Docs or Photoshop are designed to express ideas you’ve already developed, and productivity tools like email, calendar, or todo lists are good for tracking and administration. Surprisingly few digital tools exist for the early, freeform phase of ideation.
Building on the foundation of our earlier user research, we continue to explore tablet interfaces as a freeform thinking space. For this iteration, we took inspiration from physical spaces for idea gestation and crafting: libraries, workshops, drafting tables, and artists’ studios.
Inspiration from crafting spaces
Creative people tend to nest. A professor in their classroom, a writer in their home office, a woodworker in their shop, an artist in their studio—these spaces are full of work surfaces like desks and drafting tables, drawers and pegboards full of tools, pinboards and chalkboards and whiteboards, scraps of paper, photos, books, printouts, works-in-progress, post-its, and more. They are messy, informal, mixed together, freeform, and personal.
Compare these to our digital workspaces: what we see on the screens of our computers, tablets, and phones. Lists and grids of items, neatly arranged by the computer (perhaps sorted by date or name). Media types are often siloed (text notes in one place, web pages in another, PDFs and their highlights in still another) and difficult to view together. We don’t entirely choose what goes into our workspaces and things don’t tend to stay where we left them. In short, digital workspaces are structured, sterile, and impersonal.
Creativity is about making connections. This seems to demand a freeform, fluid space where creative fodder can be mixed together and sorted to the user’s liking. So why are freeform environments so rare in digital workspaces?
Physical and digital workspaces typically have different properties. We explore bringing physical notions to a digital tool.
- Physical: arranged by you. Digital: arranged by the computer.
- Physical: freeform, messy, informal, playful. Digital: neat, sorted, structured, sterile.
- Physical: write notes, draw sketches, underline, highlight anywhere on the page. Digital: only make marks in spaces delineated as drawing canvases by the computer.
- Physical: rip out pages, photocopy article excerpts, tape sketches and printouts to the wall, put things side-by-side. Digital: one document / page at a time (mobile); flip between document tabbed views; arrange windows side-by-side (desktop).
- Physical: everything there was brought by you; things stay where you left them. Digital: computer refreshes web pages, reorders document lists, or suggests new content; window or tab arrangement often lost between sessions.
- Digital: photos in one app, PDFs in another, e-books in a third, text files elsewhere still.
This led to our concept for this project. Inspired by physical workspaces, we set out to build a studio for ideas.
Based on the above comparison, we set out with the following design goals:
- A repository for creative inputs. The studio is a place to collect raw material as input to your thinking. This means everything together, with no media silos. If you get a critical mass of documents into the studio, connections will naturally form. These connections produce new ideas that can be captured in the studio—a virtuous cycle producing yet more fodder for generation of future ideas.
- Freeform and arranged by you. A spatial environment allows you to pull out excerpts, place things side by side, and arrange them in a loose, freeform, fluid way.
- Multimedia. Text, photos, sketches, PDFs, web pages, diagrams—all together to reflect your variety in creative inputs and your own thinking.
- Inking everywhere. Drawings, annotations, marginalia. Not only can you see every media type together, but you can ink on everything without restriction. This supports everything from playful doodling to the serious study of active reading.
- Excerpting. In addition to freeform arrangement of entire documents, you may want to pull out just a sentence, a paragraph, a page, or a diagram from a longer document.
Added to these are some of Ink & Switch’s general design principles for digital tools:
- No chrome. Avoid toolbars, buttons, or other administrative debris. Just you and your work. The studio presents only a blank page, quietly inviting you to capture your thoughts and develop your ideas.
- Fast. The studio should support creators moving at their natural speed of thought. It should never break the connection between user inputs and the reactions they observe from the tool. As an industry, we’ve underestimated how harmful slow software is for the creative process. It’s so discouraging and uncomfortable to do creative work when you’re waiting, looking at spinners, unsure if your last action registered. The studio for ideas should have no spinners, no waiting, and run at 120 frames per second. Documents open instantly. No command requires press-and-hold or other built-in delays.
- Use the full command gesture space. As a fast, precise tool for creative professionals, we should use every capability at our disposal. That means all ten fingers, and the stylus as a distinct input. The stylus is required.
If the studio can achieve these goals, we hope it means you, the user, will be able to create more connections among ideas that before would have been siloed. That in turn means you can develop your ideas to the very fullest and thus reach your potential as a thinking, creating human.
A tour of Muse
Thus we set out to build Muse, a studio for ideas inspired by physical workspaces.
We have not achieved all the ideals set out in the previous section, but we did hit many of them. So now we’ll take a tour of the prototype and describe what we learned from seeing it in use by our team internally and in external user tests.
A space for creativity requires raw material. This material is the fodder for our minds to generate new ideas.
Thus our studio for ideas must make ingestion of different media types easy and painless. One ingestion route we focused on was users doing research on a web browser on their desktop computer.
The Muse web clipper is a Chrome extension that can capture fragments of text, images, whole web pages, or PDFs. Users right-click on web content and use “Clip to Muse” to clip a copy of that content to their tablet.
We designed the clipper so that users have zero choices when clipping. There are no questions about what format to clip, where to save the content etc. The user simply says “I want to keep this”. Any further steps here would make it harder for users to preserve content that’s important to them.
While we believe the Muse web clipper is useful, content ingestion was not the main focus of Muse and we don’t expect that the web clipper as-is would be sufficient for real use. We do think such a web extension would be one piece of the solution for a production-grade thinking tool.
The heart of the studio for ideas is reading, sifting, sorting, and making sense of the raw material the user has ingested.
We chose a spatial layout for Muse: mixed-media cards organized with a boards metaphor.
There are some precedents for this type of interface. Milanote, for example, uses a cards-on-boards model. What’s unique about Muse is how we tried to design the interface to reflect and leverage all the subtle but important aspects of how people work in physical space.
Users can easily move, resize, create, delete, and copy cards in Muse.
We chose to use the stylus for moving and resizing because it’s more precise than fingers. This is a good example of where a stylus-required model gives greater flexibility in interaction design.
In terms of creating cards, we think it’s important for you to be able to start your creative work without ceremony. Our benchmark for ease is turning to a new page in a paper notebook. With that, no idea will be squashed by the hassle of getting started. In Muse we achieve this instant-start by dragging from the edge of the screen to create a new nested board.
Copying cards uses a related gesture, placing one finger on the card to be copied and pulling the copy out with the stylus.
Deletion is the inverse of creating—drag a card off the screen.
A subtle benefit of this design is that you can undo a creation in mid-flight; just drag the card off the edge.
The gestures can even be used to play around with your content, e.g. by creating a copy and throwing it away repeatedly. This may seem trivial, but it helps make the app feel more personal and less fragile to users, which we think encourages relaxed, creative thinking.
There are a few downsides to this gesture set. One is that the create gesture doesn’t obviously scale to multiple document types. If another document type were available in addition to boards (say, tables) we’d need some way to indicate which type to create. The copy gesture has some ambiguity between copy-by-value and copy-by-reference. The deletion gesture makes hard deletes too easy to do. We’d probably address this last issue with a combination of undo and changing delete to an “archive” operation that sends documents to a staging area, as in Gmail.
Muse uses zoom-based navigation to leverage users’ innate spatial thinking, visual processing, and hand-eye coordination.
Any time a user initiates, adjusts, or releases a zooming pinch, the interface responds instantly. This gives a fluid feeling and builds the user’s confidence in the interface.
Relatedly, there are no discrete navigation jumps. As a consequence you can always initiate a partial movement and then gradually roll it forward and backwards until you reach some stable level. It also means you never lose your orientation, as every successive screen flows smoothly from the last (and you can always pinch in to get back).
One question we faced with a zooming interface was how to represent boards within boards. Just scaling down the board would make an inner board hard to distinguish as it’d be the same color as the surrounding board. To solve this we use a slight haze over distant boards, so that they appear darker the further away they are. This provides the necessary visual distinction between outer and inner boards without obscuring the content of the inner boards. It can also be incrementally animated as the zoom is in progress.
An important implication of the zooming design is that there’s no “opening” or “closing” documents. You just get close to them and they’re there, and when you’re far enough away they’re effectively invisible. This reflects how documents work in the real world; to read a piece of paper you just look at it. You don’t need to load your paper or open a paper-reading app first.
Users can hold cards they’re moving even as they’re zooming. This is an example of the interface taking advantage of the user using two hands—one for touch and one for the stylus—at the same time.
Zooming interfaces are not new. The first one was built by Ivan Sutherland in 1962. Zooming is familiar today in maps apps like Google Maps. Indeed we deliberately borrow the pinch and double-tap gestures from these apps. However, as far as we know, Muse’s combination of zooming interface, continuous design, concurrent stylus and hand gestures, and very high responsiveness is novel in a productivity tool.
Users quickly learned the interface and reported it made intuitive sense. We’d need more testing with full user corpuses to better assess the true potential of zooming user interfaces for document management, but we think they’re promising.
In physical and paper workspaces, artisans use a variety of techniques for keeping current work close at hand: desktop spaces (a metaphor used in most workstation platforms but absent in touch platforms), drawers that slide out, post-its or other items taped or pinned nearby, inboxes.
In Muse, we try to replicate this concept of an informal scratch space with something we call the shelf. The shelf can be summoned and dismissed with a swipe, and it moves with you as you zoom around the space.
One use of the shelf is receiving cards from the web clipper described above. Once an item lands on the shelf, you can move it to its desired location on your board.
The shelf also acts as a copy-paste buffer. It differs from a traditional copy-paste buffer in that a) the contents are visible to the user, b) it can hold multiple items, and c) the user can choose to use a subset of the items in the buffer.
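The three differences from a traditional clipboard can be sketched as a small data structure. This is only an illustration in Python, not Muse's implementation; the `Shelf` class and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Shelf:
    """Hypothetical model of the shelf as a visible, multi-item buffer."""

    items: list[str] = field(default_factory=list)

    def put(self, card: str) -> None:
        # (b) Unlike a one-slot clipboard, adding a card never overwrites others.
        self.items.append(card)

    def visible(self) -> list[str]:
        # (a) The contents can always be inspected by the user.
        return list(self.items)

    def take(self, *cards: str) -> list[str]:
        # (c) The user picks any subset; everything else stays on the shelf.
        taken = [c for c in self.items if c in cards]
        self.items = [c for c in self.items if c not in cards]
        return taken


shelf = Shelf()
for card in ["clipped article", "sketch", "pdf excerpt"]:
    shelf.put(card)
placed = shelf.take("sketch")
```

After `take`, only the chosen card leaves the shelf, whereas a conventional paste always delivers the single most recent item.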
The shelf can be adjusted at the same time as a card is being moved. This allows you to e.g. scoop up a card with a moving shelf, yank the shelf out from under a card to place it on a board, or do other compound movements.
The shelf supports zooming, similarly to the main view. Thus it’s possible to make a single item take up all of the shelf space, so the shelf can enable split-screen viewing of two items at the same time. An example of when this is useful is reading a PDF in the main view and taking notes in the shelf view.
Overall we think the shelf is a powerful construct, especially on mobile where such scratch space is traditionally lacking.
Annotation and inking
Pinboards, whiteboards, print books, post-its, and other forms of paper workspaces offer not just freeform arrangement of their contents, but also the ability to scribble anywhere. Highlighted or underlined passages, marginalia, even doodles in the margin help make an informal play space where ideas can flourish.
For this reason, our goal for Muse was simple: you can draw on anything.
On boards, inking can be used for e.g. free-form notes and drawings.
In addition to supporting inking-as-primary-content, we also wanted to leverage ink for annotations and metadata. On a board, for example, one could use ink to label cards.
In Muse it’s possible to write on any media type, not just boards: for example, on PDFs, images, text, and web clippings.
Inking is an intuitive way to engage with content on the tablet form factor. We think supporting inking across all media types brings benefits both in terms of improving processing of the content with active reading and in the more concrete sense of being able to mark up the documents.
You can even take notes on the shelf. We sometimes use this in the same way you might use a sticky on your desk or notes in the back cover of a book.
An obvious question in an inking interface is how the ink looks. What style lines are used, with what thickness, and what color? Can these be changed? In Muse we’ve experimented with a highly opinionated approach where the tool supports only one ink type for any given media type, with each type carefully designed for the context it’ll be used in. The goal was to give users a consistently great inking experience without having to introduce chrome, modes, options, etc. that take up screen space and mental energy.
Overall we feel this approach worked pretty well. For lots of uses one ink style is all you need. Not seeing or thinking about ink style buttons, width sliders, etc. was great. The thing users most wanted choice on was color. Future design work could explore how to support a limited amount of choice in ink types (say, among a few colors) without creating the interface problems of tablet apps with arbitrarily customizable inking.
A last design consideration for ink is how to erase it. To engage the eraser a user presses one finger on the screen while using the stylus with a typical writing grip (see below for a full discussion of stylus tool selection). Additionally, the “size” of the eraser grows with the amount of pressure applied to the stylus. This avoids another bit of UI chrome (an eraser-size-selector) and proved easy for users to adopt.
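A pressure-sensitive eraser can be sketched as a simple mapping from stylus pressure to radius. The linear curve, the normalized 0.0 to 1.0 pressure range, and the radius bounds below are all assumptions for illustration; the essay does not specify how Muse computes the size.

```python
def eraser_radius(
    pressure: float,
    min_radius: float = 4.0,   # assumed radius at the lightest touch
    max_radius: float = 40.0,  # assumed radius at full pressure
) -> float:
    """Map normalized stylus pressure (0.0 to 1.0) to an eraser radius."""
    # Clamp so out-of-range sensor readings cannot produce a negative or huge eraser.
    clamped = max(0.0, min(1.0, pressure))
    return min_radius + clamped * (max_radius - min_radius)
```

Because the size is read continuously from the stylus, there is no size setting to store and no widget needed to change it.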
PDFs, scanning, and active reading
For many of our target users, PDFs are a foundational media type. Professors review papers; architects use PDFs of floorplans; attorneys read and annotate legal documents. For this reason, support for PDFs was a key feature of Muse.
A useful property of physical books and papers is that you can scan them quickly. For example:
- Identify books at a glance, based on the color, layout, etc. of their cover.
- Flip through the book to get a sense of the content.
- Find a known image or annotation by scanning quickly through the pages.
- Open up to a random page and start reading to get a sample of the book.
- Judge how many pages there are at a glance, based on thickness.
We think these “scanning” actions are critical for users. They allow you to do important operations quickly and in a way that leverages our powerful visual pattern recognition facilities.
These operations are not possible in the classic computer PDF reader experience: a filename in small text next to a generic PDF icon, which when opened leads to a PDF reader that displays one page at a time with linear scrolling.
We wanted to support these scanning operations in Muse. We accomplished this with PDF overviews.
When viewed from the outside, these overviews identify the documents and give users a sense of what to expect. The first page of the PDF is enlarged as it usually has important information (e.g. title, cover image). Users can also get a sense of the shape of the document (for example, how much of it is images vs. text) and see whether it is short or longer than one overview screen.
Within the overview, users can quickly scroll through the document. Once they find a page of interest, they can peek in smoothly to confirm it’s what they’re looking for, and either continue zooming in to the single-page reader or back out to the overview.
In user tests, the PDF overview was one of the most appreciated features. Users were often surprised as the overview is not like other tools they’ve used, but with some use they came to appreciate it.
One way to do active reading of a document is annotations: making notes or circling relevant bits. But another feature helpful here is excerpting (also known as highlighting, quoting, or transclusions).
For example you might want to excerpt facts you’ll be citing, graphics to send to a colleague, or references to chase down at the library. For this use case Muse provides first-class excerpts.
To create an excerpt, you drag across the PDF with a stylus using a low-angle grip (see below for more on grips). By default, this creates a snapshot-style excerpt, which is visually similar to what you’d get taking a screenshot over the same area. If you start the excerpt gesture on a piece of text, a text excerpt is created which pulls out the text content in a format-independent way.
Excerpts are first-class cards like boards, images, and their source PDFs. You can e.g. move them around and zoom in to them as full-screen items.
Excerpts also know where they came from, and can take you back to that context.
When you press the black dot on the bottom-right corner of an excerpt, you “wormhole” to the source PDF for that excerpt, and then to the individual page, with the excerpted area highlighted. You can then navigate around the source PDF as usual. If you pinch to go back, you’ll go first to the PDF overview page; a second pinch goes back to the page you warped from. Thus the invariant of pinching to go back where you were before is preserved. Coming in through the PDF overview also gives you some quick context on the source document as you’re entering.
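One way to keep the "pinch to go back" invariant is to treat navigation as a stack and have the wormhole push both the overview and the page. The sketch below assumes that model; the `Navigator` class and the view names are hypothetical and not taken from Muse.

```python
class Navigator:
    """Hypothetical navigation history; each pinch back pops one level."""

    def __init__(self, start: str) -> None:
        self.stack = [start]

    @property
    def current(self) -> str:
        return self.stack[-1]

    def wormhole(self, pdf: str, page: int) -> None:
        # Route through the overview so that pinching back retraces it.
        self.stack.append(f"{pdf} overview")
        self.stack.append(f"{pdf} page {page}")

    def pinch_back(self) -> str:
        # Never pop the root view.
        if len(self.stack) > 1:
            self.stack.pop()
        return self.current


nav = Navigator("research board")
nav.wormhole("paper.pdf", 7)
```

Here the first `pinch_back()` lands on the PDF overview and the second returns to the board the excerpt lived on, matching the behavior described above.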
Note that screenshot-style excerpts are currently just bitmaps. Thus they lose the underlying text / vector information present in their source PDF. This is an artifact of our initial implementation. One could imagine preserving the vector information in a PDF-like way in excerpts, perhaps even as a little PDF. Then users would be able to do things like copy text from a screenshot-style excerpt.
We designed excerpting to be orthogonal to annotation. So you can annotate a section without excerpting it, or vice versa, or you can annotate a section and then take an excerpt with that annotation. We wanted to give users control over these two axes independently. It’s tempting to combine these operations to save a step in the case the user wants both. But users often want to just e.g. annotate documents for their own active reading purposes, and separately excerpt key pieces they need for other work or to communicate with collaborators. We therefore think it’s appropriate for them to be orthogonal, though there may be more clever approaches here.
We think first-class excerpts are a promising concept. There’s more to iterate on here, for example you should probably be able to excerpt any document type, not just PDFs. But the basic idea of pulling out excerpts and being able to manipulate them with all the powers of your tool is appealing, as is the ability to reference back to the original source.
Perhaps the most powerful example of excerpting in a productivity tool is LiquidText, from which we draw inspiration. In Muse we’ve tried to combine such excerpting capabilities with the design goal of managing all content types (including source documents and excerpts from them) in the same way, with the same tools (i.e. as movable cards on boards). We think this unified approach is promising but further user testing is needed.
Text is an important data type for creative work. In Muse, users create text cards by clipping a text string from a web page or by creating a text excerpt from a PDF. We tried a few design approaches with the resulting text cards in Muse.
One was to bold the first line of text cards. This provides a natural title without having to introduce markup, formatting buttons, etc. into the interface. Cards with this styling have more structure.
Another choice was to always apply a reasonable line width, regardless of the dimensions of the cards or level of zoom. This prevents unreadable text layouts, like lines that span the whole width of an iPad screen. Another effect of the stable layout is making it possible to annotate text with ink. Since the layout is always the same, it’s coherent to ink spatially on the rendered text. Ink wouldn’t make sense on text that constantly changed layout.
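The stable-layout idea can be illustrated with a wrap function that deliberately ignores the card's dimensions. This is a sketch using Python's standard `textwrap` module; the 60-character measure and the `layout_text` helper are assumptions, since the essay gives no specific value.

```python
import textwrap

LINE_WIDTH_CHARS = 60  # assumed measure; not a value from the essay


def layout_text(text: str, card_width_chars: int) -> list[str]:
    """Wrap text at a fixed measure, regardless of the card's width."""
    # card_width_chars is accepted but unused: the layout must not depend on it.
    return textwrap.wrap(text, width=LINE_WIDTH_CHARS)


sample = "Creativity is about making connections. " * 5
narrow = layout_text(sample, card_width_chars=30)
wide = layout_text(sample, card_width_chars=200)
```

Since `narrow` and `wide` contain identical lines, an ink stroke anchored to a given line and offset still points at the same words after the card is resized.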
Text cards also don’t follow the spatial model used in the rest of the app; they don’t get bigger as you “zoom” in. This allowed us to read text on cards within boards. That’s helpful when the text cards are used as labels and captions. But breaking the spatial model also reduces consistency.
We think more experimentation is needed on representing text in an app like Muse. The question of sizing / zooming needs more thought, and we’d also like to consider rich text or Markdown, and how ink annotations would interact with editable text.
Stylus tool selection
The biggest unsolved design problem from our previous prototype was tool selection with the stylus. That is, how do you switch between functions like moving cards, inking, creating boards, and erasing?
There are two constraints that, if respected, make this stylus tool selection problem hard. The first constraint is not cluttering the user’s screen with chrome. In Muse we wanted 100% of the screen to be devoted to the user’s content at all times. So no tool palettes, buttons, menus, etc. (the obvious UIs for tool selection).
The second constraint is not creating a modal interface. By this we mean that the result of a gesture shouldn’t depend on state that isn’t the immediate focus of the user’s attention. Thus buttons on the top of the screen that toggled between (say) inking and erasing would be modal, because when the user goes to ink or erase their locus of attention will no longer be on the buttons, but on the page they’re about to draw on.
The solution we arrived at in Muse is to use physically distinct stylus grips—and resulting stylus altitudes—to provide more differentiation of gestures.
The key to this design is that the two grips are a) physically distinct, with a discrete motion required to move from one to the other, and b) force different altitudes for the stylus. This means that users can habituate to the different gestures (essentially, train muscle memory) and that Muse can differentiate between them based on altitude. We combine this differentiation with the content that the user is focusing on in the operation to give the following full set of operations in Muse.
A combination of context and gesture determines the operation invoked.
- Gestures: stylus writing grip; stylus writing grip + finger touching screen; low-angle grip + finger touching screen.
- Contexts: edge of board (create / delete card); middle of card; corner of card.
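Resolving an operation from grip, finger state, and context can be sketched as a lookup table. Everything specific here is an assumption for illustration: the 35-degree altitude cutoff, the function names, and most of the grip-to-operation pairings, which are inferred from the surrounding prose rather than taken from Muse.

```python
import math

LOW_ANGLE_MAX_RADIANS = math.radians(35)  # assumed cutoff between the two grips


def classify_grip(altitude_radians: float) -> str:
    """Classify the grip from stylus altitude (angle above the screen)."""
    return "low-angle" if altitude_radians < LOW_ANGLE_MAX_RADIANS else "writing"


# (grip, finger_down, context) -> operation. All pairings are inferred, not confirmed.
OPERATIONS = {
    ("writing", False, None): "ink",
    ("writing", True, None): "erase",
    ("low-angle", False, "edge of board"): "create / delete card",
    ("low-angle", False, "middle of card"): "move card",
    ("low-angle", False, "corner of card"): "resize card",
    ("low-angle", True, "middle of card"): "copy card",
}


def resolve(altitude_radians: float, finger_down: bool, context: str) -> str:
    grip = classify_grip(altitude_radians)
    # In this sketch, writing-grip gestures behave the same in every context.
    key_context = None if grip == "writing" else context
    return OPERATIONS.get((grip, finger_down, key_context), "no-op")
```

The lookup takes only what the hand is doing and what it is touching at that moment, with no stored tool state, which is what keeps such a scheme modeless.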
After some time using the app, users develop habits around the different operations and move between them without conscious interruption.
Creating a modeless interface made a big difference in usability. In both Muse and our previous Capstone project, modal stylus interfaces inevitably left users frustrated. We tried variations on modal interfaces in Muse, but they were all unsatisfactory to varying degrees.
We think the main takeaway from this experiment in Muse is that modeless stylus interfaces are preferable to modal ones, and that they are possible in practice. A secondary lesson is that distinct stylus grips are a plausible mechanism for achieving this in future projects.
We expect that more exploration of modeless stylus interfaces will yield further options (especially if more hardware options become available, such as a second stylus for the device). Specific future designs will also depend on the exact set of operations desired, as each new operation threatens to exhaust the available gesture space and thereby require a total change of the design.
There are a few important problems we either didn’t get to address in Muse or for which we didn’t find a satisfactory answer. We hope to address these problems in future work. If you have insights on these or would like to collaborate, let us know!
Lost or broken styluses
One thing that the iPad Pro doesn’t solve is the problem of lost or broken styluses. Several of our test users had issues here: for example, one user’s stylus broke on a flight immediately before their user test. We think tools that require a stylus are still viable, but will be troublesome until platforms make it easier for users to always have a working stylus. The most recent iPad Pros do have a magnetic attachment for a stylus on the side of the device, which anecdotally helps with this problem.
Import / export
Right now it’s too hard to get documents into Muse. There are some obvious things we could do here: implementation of mobile share sheets for ingestion, a more complete browser extension, a desktop import tool, etc. In the other direction, work is currently trapped in Muse and can’t easily be used in a user’s ongoing creative work. For example, if a user makes free-hand notes in Muse, it’s hard to use those notes to compose an essay on a desktop. Again there are straightforward options here, like exporting from Muse to standard file formats.
Unification with the user’s corpus
That said, we think a future in which we create yet another data silo in an app like Muse is less interesting than one in which Muse natively integrates with users’ document storage systems of choice. Even with good import/export you are by definition constantly doing extra steps. Many of our target users use Dropbox or Google Drive to store their long-term creative work. So perhaps it’s more interesting for an app like Muse to integrate with these data stores than to constantly import/export from its own data silo.
No text input or editing
One area we didn’t attempt to address was entering or editing text. This is a huge missing capability. Although a user can take handwritten notes, there are many circumstances where a few words or a sentence of structured plaintext would be more suitable. Consider also the need to make a few edits, even just formatting cleanup, to an existing text excerpt or text imported from some other source.
Touchscreen-only devices are poor for text entry and manipulation. Options to address this include an onscreen virtual keyboard, a wireless or folio physical keyboard, voice dictation, and handwriting recognition. All of these have major problems or weaknesses and we anticipate this is a major problem that must be solved to make a useful studio for ideas on a tablet.
We’ll address this in future research. Consider this also an invitation to other researchers to tackle the problem; if you’ve done so, we’d love to hear from you.
Phone and desktop apps
A full-powered creative suite will work across multiple devices. Different form factors are better for different types of work, and the best tool sets will leverage that. For example, Muse shows how tablets are great for reading, annotating, and free-form ideating. However, phones are better for on-the-go lookups, and desktops are still best for complex research, authoring, and editing workflows.
With Muse, we mostly had to deal with one device (the web clipper is sort of a second device, but was minimal). To extend to multiple devices, there’s an obvious technical challenge of synchronizing state. Our lab has worked on this problem for some time (e.g. with Automerge), but it is far from solved. Another challenging problem is the UI: the spatial interface worked well on a tablet, but what would it mean for other devices? A phone is probably too small to legibly render a whole board. A desktop could render a board, but power users want to use keyboard shortcuts there, which don’t obviously translate to spatial UIs.
Is the answer to this challenge to implement a different metaphor for each form factor, according to the strengths of the device? For example, touch-based spatial navigation on tablets, simple lists on phones, and keyboard-navigable pages on desktops? This might optimize the interaction for each device, but also leave the user with an inconsistent mental model across their devices, inhibiting fluid thinking. But it would also be disappointing if the answer was a least-common-denominator approach across the different form factors.
Annotating movable content
In Muse it’s possible to annotate / ink on any content type, including boards, images, PDFs, web pages, and text documents. For something like PDFs, which visually render in the same way everywhere, inking annotations make total sense. For example, if a user draws a red line in the upper-right corner of page 3 of a PDF on their tablet, it’s clear how to render that annotation on another tablet, regardless of its screen size, orientation, etc.
It’s less obvious what to do in general about e.g. web pages, because how these documents render depends on the user’s device: their screen size and resolution, current font size, etc. So if a user annotates a web page with a line on one device, how should it be rendered elsewhere? There isn’t an obvious answer. In Muse we’ve avoided this problem by assuming the web page will only ever be rendered on one device (the user’s tablet) and thus that it will be rendered the same every time.
A similar problem exists for text. Text can be rendered in different ways on different devices. Users also expect text to change (e.g. if you edit it in a text editor), providing another source of disorientation for the original annotations.
We think being able to annotate any document type is useful, but it will require careful design to be deployed in more general systems. Perhaps the answer is simply to freeze or copy any document before it is annotated. There may be better answers to this design problem, though.
Scalable tool selection
Earlier we described solutions for interpreting touch and stylus gestures without resorting to modes or interface chrome. But these solutions are sensitive to the total number of desired operations. If we wanted to add even one more operation to the tool, we might need to totally redesign the interface / gesture system. Even something as simple as a “settings menu” doesn’t have an obvious place in our design right now.
We expect production-grade tools will require more operations than Muse currently supports. One way to support them would be to continue to design bespoke interfaces as operations are added, but there may be more scalable approaches. For example, in keyboard interfaces, the pattern of some set of modifier keys being used with single letter keys provides potential shortcuts for tens or even hundreds of operations. It may be desirable to find such an interface for touch devices. Options to explore include chording and auxiliary mechanical modifier keys.
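The scaling of the keyboard pattern is easy to check with a little arithmetic. The sketch below assumes each shortcut is one letter plus any non-empty combination of modifier keys; the `shortcut_count` helper is hypothetical.

```python
def shortcut_count(modifier_keys: int, letter_keys: int = 26) -> int:
    """Count shortcuts formed by one letter plus a non-empty set of modifiers."""
    # m modifiers have 2**m subsets; drop the empty one (a bare letter just types).
    return (2 ** modifier_keys - 1) * letter_keys
```

One modifier gives 26 shortcuts and three give 182, which is where "tens or even hundreds" comes from. A chording scheme for touch would need a similar combinatorial structure to scale as well.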
A freeform digital space for idea development could be a boon to creative people. With the Muse studio for ideas, we sought to move away from the common qualities of digital workspaces (structured, neat, and sterile) and to capture some of the qualities of physical crafting spaces (messy, informal, and personal).
We think this project demonstrates promise along these lines, but many questions remain open. What do you think: did we achieve our goals, or do you think this goal is even worthwhile in the first place? Or perhaps you are working on a related project with your own insights to share. Either way we’d love to hear from you: @inkandswitch or [email protected].
Update August 2019: Based on the positive response to this article, we’re working on Muse as a full product. If you’d like to be a beta tester, sign up here.